About the corpus

    The NCCU Corpus of Spoken Taiwan Mandarin, formerly the NCCU Corpus of Spoken Mandarin, is a language documentation project that offers open access to its data, free of charge, for research and teaching purposes. It has been collecting spoken data from everyday face-to-face Mandarin conversations in Taiwan since 2006. Written consent for publication of the spoken data was obtained from the participants. The speech is given a broad transcription that marks essential interactional features such as turn transitions, overlaps, and code-switching. The spoken data are checked and revised from time to time for completeness and consistency.


    Part of the corpus data is also available at TalkBank, which aims to foster fundamental research in the study of human communication.


    Funding for this language documentation project:

    • The Aim for the Top University and Elite Research Center Development Plan, National Chengchi University (2006 – 2008)
    • The Humanities Research Center of the National Science Council (2006, 2008)
    • The Office of Research and Development, National Chengchi University (2008)
    • Research projects, the Ministry of Science and Technology (2009 – present)

    © NCCU Corpus of Spoken Taiwan Mandarin
    No. 64, Sec. 2, ZhiNan Rd., Wenshan District, Taipei City 11605, Taiwan R.O.C.
    +886-2-29393091 ext. 63912
