Attila Novák

Personal

About Computational linguist, expert on computational morphology.

EDUCATION

2012 – 2015 Pázmány Péter Catholic University, Budapest
Roska Tamás Doctoral School of Sciences and Technology PhD program
Field of research: Computational morphologies for Hungarian and other Uralic languages
1995 – 1999 Eötvös Loránd University, Budapest
Faculty of Arts, Theoretical Linguistics major
1992 – 1993 Friedrich-Schiller-Universität Jena, Germany
1989 – 1994 Budapest University of Technology
Faculty of Electrical Engineering and Computer Science, Computer Science major
July – August 1989 Choate Rosemary Hall, Wallingford CT, USA
September 1985 – June 1989 Táncsics Mihály Grammar School, Budapest

PROFESSIONAL ACTIVITIES

2012 – Researcher in the MTA-PPKE Hungarian Language Technology research group; creation of a psychologically motivated computational model of syntactic analysis for Hungarian.
2001 – 2011 Software developer and computational linguist at MorphoLogic; development of computational morphologies for Hungarian, German, French, Spanish, and some small Uralic languages: Komi, Udmurt, Mari, Nganasan, Tundra Nenets, Mansi and Khanty; creation, conversion and quality control of dictionary databases; hybrid and statistical machine translation; creation of a knowledge portal of e-administration; design and implementation of various software tools and algorithms related to natural language processing.
1998 – 2001 Young researcher at the Research Institute for Linguistics of the Hungarian Academy of Sciences; creation of a constraint-based formalism for computational morphology, the tools implementing it and a Spanish and a Hungarian morphological analyzer.
1995 – 1999 Participation in various projects for MorphoLogic and the Research Institute for Linguistics of the Hungarian Academy of Sciences: statistics-based suggestion algorithm for the spell checker of MorphoLogic; a finite-state Polish morphological database; a German computational morphology; a syntactic parser.
1994 – Novati Kft., founder, CEO, software engineer; development of special tools and applications in the field of language technology, and of language resources; outsourced editorial work for the publication of reference books and general and professional dictionaries.

TEACHING EXPERIENCE

1998 – 2003 Eötvös Loránd University
Theoretical Linguistics Program: Logic
Computational linguistics and Introduction to linguistics

RESEARCH AND PROJECT PARTICIPATION

2012 – 2013 Creation of adapted morphological databases of Dutch, Italian and Russian for on-line and pop-up dictionaries.
2011 Ob-Ugric languages: conceptual structures, lexicon, constructions, categories. An innovative approach to creating descriptive resources for Khanty and Mansi (08-EuroBABEL-OP-015) – creation of the electronic on-line version of Munkácsi Bernát–Kálmán Béla (1984) Wogulisches Wörterbuch [Mansi Dictionary], Akadémiai Kiadó, Budapest.
2010 – 2014 Morphologically annotated historical corpus of private language use (OTKA 81189) – creation of a Middle Hungarian morphological analyzer, automatic and manual annotation tools, and a corpus query system.
2010 Data and document retrieval system for the archives of the Hungarian Atomic Energy Authority – language identification, stemming and indexing of Hungarian and English documents; extension of morphological dictionaries with nuclear terminology.
2009 – 2010 Knowledge portal of e-administration (ÁROP-2007/1.2.3-2008-0002.) – Development of an ontology of e-administration and an automatic keyword generator for a knowledge portal.
2009 – 2013 Hungarian generative historical syntax (OTKA NK 78074) – creation of an Old and Middle Hungarian morphological analyzer and automatic morphosyntactic annotation of historical texts.
2008 – 2010 Ob-Ugric morphological analyzers and corpora (OTKA NF71707) – creation of Northern Mansi, Synya and Kazym Khanty morphological analyzers and annotated corpora.
2006 – 2009 Morphological analyzer for Nganasan (OTKA K60807) – improvement and extension of the previously created Nganasan morphological analyzer, morphosyntactic annotation of Nganasan texts.
2006 – 2009 EuroMatrix: Statistical and hybrid machine translation between all European languages (STREP FP6-34291).
2005 – 2008 Permic linguistic databases (OTKA T048309) – creation of improved and extended morphological analyzers for Komi and Udmurt.
2005 – 2007 Interactive analysis of contents of medical texts for electronic administration of medical history (AKF GVOP-311-2004-05-0363/30) – automatic morphological and partial syntactic analysis of medical text.
2004 – 2006 Development and standardization of natural language processing infrastructure (KKV GVOP-2004-333) – project manager – development and enhancement of morphological analyzers for the following languages: Hungarian, German, English, Polish, French, Spanish, Romanian, Czech, Slovak, Dutch, Italian, Croatian.
2001 – 2005 Complex Uralic Linguistic Database (NKFP 5/135/2001) – creation of morphological analyzers for Komi, Nganasan, Tundra Nenets, Udmurt, Mari, and Mansi.
1995 – 1998 GRAMLEX (Copernicus Joint Research Project 621) – creation of a syntactic parser.

PERSONAL SKILLS

Language skills Hungarian, English and German: fluent
Italian, Spanish, French: reading
Programming languages and formalisms Perl, C++, JavaScript, HTML, CSS

Publications

Novák A, Siklósi B, Oravecz C.  2016.  A New Integrated Open-source Morphological Analyzer for Hungarian. Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016).
Novák A.  2016.  Új integrált magyar morfológiai elemző. XII. Magyar Számítógépes Nyelvészeti Konferencia (MSZNY 2016). :78–86.
Siklósi B, Novák A.  2016.  Közeli rokonunk, az autó. XII. Magyar Számítógépes Nyelvészeti Konferencia (MSZNY 2016). :27–36.
Siklósi B, Novák A.  2016.  Digitális Konzílium – egy szemészeti klinikai keresőrendszer. XII. Magyar Számítógépes Nyelvészeti Konferencia (MSZNY 2016). :230–240.
Siklósi B, Novák A.  2016.  Beágyázási modellek alkalmazása lexikai kategorizációs feladatokra. XII. Magyar Számítógépes Nyelvészeti Konferencia (MSZNY 2016). :3–14.
Novák A, Siklósi B.  2016.  Magyar nyelvű szövegek automatikus fonetikai átírása. XII. Magyar Számítógépes Nyelvészeti Konferencia (MSZNY 2016). :134–143.
Novák A, Siklósi B.  2016.  Ékezetek automatikus helyreállítása magyar nyelvű szövegekben. XII. Magyar Számítógépes Nyelvészeti Konferencia (MSZNY 2016). :49–58.
Novák A, Siklósi B.  2016.  Grapheme-to-phoneme Transcription in Hungarian. International Journal of Computational Linguistics and Applications. 7:171–193.
Novák A.  2015.  "Olcsó" morfológia. XI. Magyar Számítógépes Nyelvészeti Konferencia (MSZNY 2015). :145–157.
Siklósi B, Novák A.  2015.  Restoring the Intended Structure of Hungarian Ophthalmology Documents. Proceedings of the BioNLP 2015 Workshop on Biomedical Natural Language Processing. :152–157.
Endrédy I, Novák A.  2015.  Szótövesítők összehasonlítása és alkalmazásaik. Alkalmazott Nyelvtudomány. 15:7–27.
Novák A.  2015.  Making Morphologies the "Easy" Way. Computational Linguistics and Intelligent Text Processing: 16th International Conference, CICLing 2015, Cairo, Egypt, April 14-20, 2015, Proceedings, Part I. :127–138.
Novák A.  2014.  A Humor új Fo(r)mája. X. Magyar Számítógépes Nyelvészeti Konferencia. :303–308.
Novák A.  2014.  A New Form of Humor – Mapping Constraint-Based Computational Morphologies to a Finite-State Representation. 9th International Conference on Language Resources and Evaluation (LREC-2014).
Orosz G, Novák A.  2014.  PurePos 2.0 egy hibrid morfológiai egyértelműsítő rendszer. IX. Magyar Számítógépes Nyelvészeti Konferencia. :373–377.
Orosz G, Novák A, Prószéky G.  2014.  Lessons Learned from Tagging Clinical Hungarian. International Journal of Computational Linguistics and Applications. 5.
Siklósi B, Novák A.  2014.  A magyar beteg. X. Magyar Számítógépes Nyelvészeti Konferencia. :188–198.
Siklósi B, Novák A, Prószéky G.  2014.  Resolving Abbreviations in Clinical Texts Without Pre-existing Structured Resources. Fourth Workshop on Building and Evaluating Resources for Health and Biomedical Text Processing (BioTxtM 2014).
Siklósi B, Novák A.  2014.  Identifying and Clustering Relevant Terms in Clinical Records Using Unsupervised Methods. 2nd International Conference on Statistical Language and Speech Processing.
Novák A.  2014.  Vocabulary Extension by Paradigm Prediction. PhD Proceedings Annual Issues of the Doctoral School. :145–148.
Novák A, Orosz G, Wenszky N.  2013.  Morphological annotation of Old and Middle Hungarian corpora. Proceedings of the ACL 2013 workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities. :43–48.
Siklósi B, Novák A.  2013.  Detection and Expansion of Abbreviations in Hungarian Clinical Notes. MICAI 2013: 12th Mexican International Conference on Artificial Intelligence. 8265:318–328.
Siklósi B, Novák A, Prószéky G.  2013.  Context-Aware Correction of Spelling Errors in Hungarian Medical Documents. Statistical Language and Speech Processing. :248–259.
Orosz G, Novák A.  2013.  Purepos 2.0: a Hybrid Tool for Morphological Disambiguation. International conference Recent Advances In Natural Language Processing RANLP. :539–545.
Orosz G, Novák A, Prószéky G.  2013.  Hybrid Text Segmentation for Hungarian Clinical Records. Advances in Artificial Intelligence and Its Applications: 12th Mexican International Conference on Artificial Intelligence, MICAI 2013, Mexico City, Mexico, November 24-30, 2013, Proceedings, Part I. :306–317.
Orosz G, Laki LJ, Novák A, Siklósi B.  2013.  Combining Language-Independent Part-of-Speech Tagging Tools. 2nd Symposium on Languages, Applications and Technologies.
Laki LJ, Novák A, Siklósi B.  2013.  Hunglish mondattan – átrendezésalapú angol-magyar statisztikai gépifordító-rendszer. IX. Magyar Számítógépes Nyelvészeti Konferencia. :71–82.
Laki LJ, Novák A, Siklósi B.  2013.  English to Hungarian Morpheme-based Statistical Machine Translation System with Reordering Rules. Proceedings of the Second ACL 2013 Workshop on Hybrid Approaches to Machine Translation (HyTra). :42–50.
Laki LJ, Novák A, Siklósi B.  2013.  Syntax Based Reordering in Phrase Based English-Hungarian Statistical Machine Translation. International Journal of Computational Linguistics and Applications. 4:63–78.
Laki LJ, Orosz G, Novák A.  2013.  HuLaPos 2.0 – Decoding morphology. MICAI 2013: 12th Mexican International Conference on Artificial Intelligence. 8265:294–305.
Wenszky N, Novák A.  2013.  The Hypercorrect Key Witness. VLlxx: Papers presented to Varga László on his 70th birthday.
Siklósi B, Orosz G, Novák A, Prószéky G.  2012.  Automatic structuring and correction suggestion system for Hungarian clinical records. 8th SaLTMiL Workshop on Creation and Use of Basic Lexical Resources for Less-resourced Languages. :29–34.