With the rise of digital humanities, historians are exploring how to engage intellectually with textual sources using today's computational tools. The ENP-China project employs Natural Language Processing methods to tap into sources of unprecedented scale, with the goal of studying the transformation of elites in Modern China (1830-1949). One of its subprojects extracts various kinds of data from biographies; for that purpose, we created a large corpus of biographies automatically collected from the Chinese and English Wikipedia. The dataset contains 228,144 biographical articles from an offline copy of the Chinese Wikipedia and is supplemented with 110,713 English biographies that are linked to a Chinese page. We also enriched this bilingu...
Diving into a first dataset. A major objective of the ENP-China project is to bring together expert...
This paper explores automatic methods to identify relevant biography candidates in large databases, ...
We add to the literature on notable individuals (famous, prominent, distinguis...
In the last few months we tried to build a corpus based on the biographies of the Chinese Wikipedia....
In the previous blog post, we described how we retrieved Chinese biographies by the use of a machine...
Difangzhi (地方志) is a large collection of local gazetteers compiled by local governments of China, a...
It is arguable whether history is made by great men and women or vice versa, but undoubtedly social ...
This paper presents the idea and our work of extracting and reassembling a genealogical network auto...
Generating factual, long-form text such as Wikipedia articles raises three key...
Wikipedia is a huge global repository of human knowledge that can be leveraged to investigate interw...
Extracting biographical information from online documents is a popular research topic a...