This is a proof-of-concept Sanskrit corpus developed for the study of Buddhist Sanskrit lexicology. It comprises 66 metadata-enriched Buddhist Sanskrit texts, for a total of 2.5 million tokens, and a 4-million-token reference corpus of 30 metadata-enriched non-Buddhist Sanskrit texts. The corpus is in romanised Sanskrit (UTF-8 encoding) and is available in two configurations: segmented (with dash-separated words), and segmented and stemmed (with capitalised word stems and compounds separated by an @ sign). The latter version can be used to generate word sketches in Sketch Engine if used in conjunction with the included sketch grammar, which infers likely syntactic dependencies from morphological cues. Limitations: As a proof...
One of the important features of the Sanskrit language is its long tradition of lexicons. The early sour...
Because of the traditional reverence for oral composition and recitation in Sanskrit literature, mos...
This repository contains the lexicographic datasets developed for a proof of concept of a Buddhist S...
This is a Sanskrit corpus developed at the Mangalam Research Center (Berkeley, California) for the s...
The work was accepted at the Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, S...
Lexical datasets containing annotated concordances of words pertaining to the conceptual domains of ...
We describe an innovative computer interface designed to assist annotators in the efficient selectio...
This folder contains R code for a rule-based Buddhist Sanskrit Segmenter and Lemmatiser, as well as ...
Sanskrit is one of the most ancient attested Indo-European languages, and it has one of the oldest l...
These data were used for the study published in: Lugli, Ligeia. 2019. Words or terms? Models of ter...
Sanskrit has a rich body of lexical resources in the form of various kinds of dictionaries, and a ...
The article describes the development of a program, written in SNOBOL4, which will scan Sanskrit ver...
This article is an edition of thirty-one Sanskrit–Tocharian bilingual fragments of the Udānavarga: t...