1.
The provided resources include:
Word vectors
Word vectors were induced from PubMed texts, PMC texts, and their combination using the word2vec tool. The word vectors are provided in the word2vec binary format.
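// The other two resources are not needed here, so I did not look at them.
The PubMed portion consists of nearly 23 million abstracts; the PMC portion consists of 700,000 full-text articles.
Wiki-PubMed-PMC combines the two sources above with 4 million English Wikipedia articles.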
PubMed Central® (PMC) is a free full-text archive of biomedical and life sciences journal literature at the U.S. National Institutes of Health's National Library of Medicine (NIH/NLM).
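Since the vectors are distributed in the word2vec binary format, a minimal reader can help clarify what that file contains. The sketch below is not taken from the resource itself. It assumes the usual layout: an ASCII header line with the vocabulary size and dimension, then for each word the token followed by a space and that many float32 values. Little-endian byte order is also an assumption, and the function name `read_word2vec_binary` is hypothetical.

```python
import struct
from typing import BinaryIO, Dict, List


def read_word2vec_binary(stream: BinaryIO) -> Dict[str, List[float]]:
    """Read word vectors from an open binary stream in word2vec binary layout."""
    # Header line: "<vocab_size> <dim>\n" in plain text.
    header = stream.readline().decode("utf-8").split()
    vocab_size, dim = int(header[0]), int(header[1])

    vectors: Dict[str, List[float]] = {}
    for _ in range(vocab_size):
        # Each entry starts with the word, terminated by a single space.
        word_bytes = bytearray()
        while True:
            ch = stream.read(1)
            if not ch:
                raise EOFError("unexpected end of data while reading a word")
            if ch == b" ":
                break
            if ch != b"\n":  # skip the newline that may follow the previous vector
                word_bytes.extend(ch)

        # The word is followed by `dim` float32 values (assumed little-endian).
        raw = stream.read(4 * dim)
        if len(raw) != 4 * dim:
            raise EOFError("unexpected end of data while reading a vector")
        word = word_bytes.decode("utf-8", errors="replace")
        vectors[word] = list(struct.unpack(f"<{dim}f", raw))
    return vectors
```

To use it, open the downloaded file in binary mode, for example `with open("vectors.bin", "rb") as f: vecs = read_word2vec_binary(f)`, where `vectors.bin` is a placeholder path. For files of this size, a library loader such as gensim's `KeyedVectors.load_word2vec_format(path, binary=True)` is likely the more practical choice, since this sketch keeps every vector as a Python list in memory.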
2.
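There are many gold-standard datasets, that is, datasets annotated by hand.
Each corpus comes with its own set of entity annotation categories.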
3.