Aleksandar Savkov
professional
Word Aligner
The Word Aligner is a side project that I am developing to support our work on the EuroMatrixPlus project. It is still under development, but a working version with the most basic features can be tested here.

The Word Aligner is based on a tool of the same name developed by Chris Callison-Burch for one of his projects (see "Statistical Machine Translation with Word- and Sentence-Aligned Parallel Corpora" and "Fast, Cheap, and Creative: Evaluating Translation Quality Using Amazon's Mechanical Turk"). Currently, the Word Aligner can word-align sentence-aligned files and produce XML-formatted output. I am working on additional features such as editing, splitting, and merging tokens, as well as phrase alignment.

EuroMatrixPlus: Bringing Machine Translation for European Languages to the User
EuroMatrixPlus is an ongoing machine translation project, part of the Seventh Framework Programme funded by the European Commission. It aims at:
  • continuing the advance of MT technologies, creating example systems for all European languages and providing other MT developers with infrastructure
  • continuing and broadening the controlled, systematic investigation of different approaches and techniques to accelerate the scientific evolution of novel methods, including both selection and cross-fertilization. The aim is to arrive at scientifically well-understood novel combinations of methods that are proven superior to the state of the art
  • contributing to the growth and competitiveness of the European MT research scene and infrastructure through open evaluations and living, community-supported surveys of resources, tools, systems and their respective capabilities
  • focusing on bringing MT to the users, in addition to focusing on scientific advances.
The BulTreeBank group is responsible for creating a Bulgarian-English parallel treebank aligned according to morphology, parts of speech and syntax.
Language Technologies for Lifelong Learning (LTfLL)
The LTfLL project was devised to deal with practical problems in supporting the activities of learners and tutors in educational and organizational settings (i.e., work overload and time management issues) through:
  • assessment of student contributions: in particular, giving formative feedback.
  • monitoring of study progress: ranging from dropout prevention to providing personalised advice.
  • community and group support: selecting and creating groups, ordering and archiving threads, providing overviews of the activities of a community as a whole and of the individual actors.
The BulTreeBank group was tasked with providing semantic search capabilities for the project, based on ontologies and NLP analysis.
Sustainability Platform for Linguistic Corpora and Resources (SPLICR)
SPLICR was a research initiative addressing the sustainability of linguistic resources. It was a cooperation between three linguistic collaborative research centres in Germany, which together comprised more than 40 individual research projects. These projects created a wide range of language resources, especially corpora, tailored to their particular needs. The aim of SPLICR was to ensure effective and sustainable access to these data by third-party researchers beyond the termination of the projects. This goal involved a number of measures, such as defining a common data format that completely captures the heterogeneous information encoded in the individual corpora, developing user-friendly and sustainably usable tools for processing (e.g. querying) the data, and specifying common inventories of metadata and terminology. Moreover, the project aimed to formulate general rules of best practice for creating, accessing, and archiving linguistic resources.
school
Bulgarian Automatic Transliteration (BAT)
BAT provides back-transliteration of Bulgarian texts written in the Roman alphabet. Since there is more than one mapping scheme between the Roman and Cyrillic alphabets, the resulting ambiguities are resolved using lexical resources enhanced by statistical methods. The user can input a text encoded with any of the prominent mapping schemes, or a combination of them, and still receive a transliteration very close to the gold standard (human transliteration). BAT integrates a semi-handcrafted hidden Markov model (HMM) to enhance its dictionary look-up and to provide a back-up transliteration when the lexical resources are insufficient or ambiguous.
  • Instructor: Dale Gerdemann
  • Course: Statistical Natural Language Processing
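The dictionary look-up idea can be illustrated with a minimal Python sketch. It is not BAT's actual code: the `MAPPING` table, the `LEXICON` set and both function names are hypothetical, the mapping covers only a handful of letters, and the fallback simply returns the first candidate, whereas BAT uses an HMM for that step.

```python
# Hypothetical toy mapping from Latin chunks to Cyrillic options.
# Real romanization schemes are larger; "a" is ambiguous here on purpose.
MAPPING = {
    "sht": ["щ"],
    "sh": ["ш"],
    "ch": ["ч"],
    "a": ["а", "ъ"],
    "e": ["е"],
    "h": ["х"],
    "k": ["к"],
    "s": ["с"],
    "t": ["т"],
}


def candidates(latin):
    """Return every Cyrillic reading of `latin` licensed by MAPPING."""
    if not latin:
        return [""]
    results = []
    for length in (3, 2, 1):  # try longer chunks first
        chunk = latin[:length]
        if len(chunk) < length:
            continue
        for cyrillic in MAPPING.get(chunk, []):
            for rest in candidates(latin[length:]):
                results.append(cyrillic + rest)
    return results


def back_transliterate(latin, lexicon):
    """Pick the first candidate found in the lexicon.

    Falls back to the first candidate (an assumption of this sketch,
    standing in for a statistical model), or to the input itself when
    no candidate can be built at all.
    """
    options = candidates(latin.lower())
    for option in options:
        if option in lexicon:
            return option
    return options[0] if options else latin


# Hypothetical two-word lexicon.
LEXICON = {"ще", "къща"}
print(back_transliterate("shte", LEXICON))
print(back_transliterate("kashta", LEXICON))
```

Enumerating all candidates is exponential in the worst case, which is acceptable for single words in a sketch but is one reason a real system would score candidates with a statistical model instead.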
Bulgarian Transcription Generator
The Bulgarian Transcription Generator is a finite-state transducer that transforms any given Bulgarian word into its standard IPA transcription. Because the phonological system of Bulgarian has few exceptions, it allows for very accurate rule-based transcription generation. The transcription generator implements this set of rules as a finite-state transducer using the Stuttgart Finite State Transducer (SFST) toolkit.
  • Instructor: Dale Gerdemann
  • Course: Finite State Methods in Natural Language Processing
Project X
ProjectX is a small-scale student project aimed at creating, developing and testing a new XML-based error annotation scheme. The learner corpus used for its development and first testing is a small portion of the NOCE learner corpus, created and developed at the University of Jaén by Anna Díaz-Negrillo.
  • Instructor: Detmar Meurers
  • Course: Exploring the Automatic Analysis of Learner Language
personal
None available just yet.
personal

I was born on 19 April 1984 in Plovdiv, Bulgaria to Dimitar Savkov and Yordanka Savkova. As a child I enjoyed hiking with my family, paper dart fights in the woods, ball games, kayaking and camping. All except fighting with paper darts remain among my favourite activities, along with my new passions: mountain biking and skiing. In my teens I became interested in computer games and later in programming and web design. I still enjoy a good friendly deathmatch now and then.

Studying computational linguistics in Germany inspired an interest in languages and language studies, too. I spent six great years in Tübingen as part of the ISCL program, attending lectures by Erhard Hinrichs, Sandra Kübler, Dale Gerdemann, Frank Richter and Hubert Trokenbrott, among others. I also had my share of Erasmus experience during a semester at Charles University in Prague, where I had much fun experiencing the city while also peeking into another school of computational linguistics.

I currently reside in Sofia, where I share an apartment with my younger brother and work as a member of the BulTreeBank group, part of the Institute for Information and Communication Technologies (IICT) at the Bulgarian Academy of Sciences (BAS).

education
WS 2009/10 University of Tübingen
Computational Linguistics, M.A.
SS 2006 University of Tübingen
Computational Linguistics, B.A.
General Linguistics (minor)
career
from Apr 2010
until present
BulTreeBank Group, Linguistic Modelling Lab (LML) at the Institute for Information and Communication Technologies (IICT) at the Bulgarian Academy of Sciences (BAS)
Research assistant
Sofia, Bulgaria
Working on multiple EU-financed projects in the field of NLP
from Jan 2007
until Oct 2009
Sonderforschungsbereich 441: Linguistische Datenstrukturen
Student-assistant
Tübingen, Germany
Hardware and Software Support
from Feb 2006
until Apr 2006
Ontotext
Intern
Sofia, Bulgaria
Developed a small ontology extension for the KIM Platform in the domain of the movie industry.
cv
contact
address
BulTreeBank Project
Linguistic Modelling Laboratory,
IICT,
Bulgarian Academy of Sciences
Acad. G.Bonchev St. 6
1113 Sofia, Bulgaria
phone
+35929796391
e-mail
savkov@bultreebank.org
social
Facebook Twitter Mendeley LinkedIn
blogs
friends
 