FF:PLIN057 Automatic processing of text - Course Information
PLIN057 Automatic processing of text
Faculty of Arts
Spring 2018
- Extent and Intensity
- 0/2/0. 4 credit(s). Type of Completion: z (credit).
- Teacher(s)
- Mgr. et Mgr. Ondřej Mrázek, Ph.D. (lecturer)
- Guaranteed by
- doc. PhDr. Zdeňka Hladká, Dr.
Department of Czech Language – Faculty of Arts
Contact Person: Jaroslava Vybíralová
Supplier department: Department of Czech Language – Faculty of Arts
- Timetable
- Mon 10:50–12:25 G13
- Prerequisites
- None.
- Course Enrolment Limitations
- The course is also open to students from fields other than those it is directly associated with.
The capacity limit for the course is 20 student(s).
Current registration and enrolment status: enrolled: 0/20, only registered: 0/20, only registered with preference (fields directly associated with the programme): 0/20
- Fields of study / plans the course is directly associated with
- Czech Language and Literature (programme FF, B-FI) (2)
- Czech Language and Literature (programme FF, B-GK)
- Czech Language and Literature (programme FF, B-HS)
- Czech Language and Literature (programme FF, B-MA)
- Czech Language and Literature (programme FF, N-FI) (2)
- Czech Language and Literature (programme FF, N-HS)
- Czech Language with Orientation on Computational Linguistics (programme FF, B-FI)
- Czech Language with Orientation on Computational Linguistics (programme FF, N-FI)
- Course objectives
- In the humanities it is often important to be able to transform textual data into a structured form. This ability enables textual analysis and information retrieval from text, and its output can serve as input for further research regardless of the text's semantics.
The aim of the course is to teach students the basic options for processing textual information with selected computer tools. The secondary aim is to teach students to treat text as a data type independent of its meaning, and to cope with different text encodings and with the portability of text between operating systems.
The course is designed for students who have no experience with this topic.
The pace and content of the course will be tailored to the students' needs. Understanding and practising the topics covered will take priority over the number of topics covered.
- Learning outcomes
- After the course the student will be familiar with the problems of text processing and will be able to:
- search the text
- transform the text into a different form
- compare texts to each other
- compile simple databases from the information obtained.
The student will also be capable of:
- using and implementing regular expressions
- basic work in the Linux terminal
- using UNIX text tools (grep, sort, uniq, cut etc.)
- using UNIX text editors (nano, sed, vim).
Depending on the students' abilities and interest, also:
- basics of scripting in Bash
- basic text processing in Python language.
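As an illustration of the outcomes listed above (searching a text with a regular expression and compiling a simple table from it), here is a minimal Python sketch. It is not taken from the course materials: the function names `word_frequencies` and `matching_lines` and the sample text are hypothetical, chosen only for this example. The first function roughly mirrors what a pipeline of tr, sort and uniq would produce; the second behaves loosely like grep.

```python
import re
from collections import Counter


def word_frequencies(text: str) -> list[tuple[str, int]]:
    """Return (word, count) pairs, most frequent first, ties in alphabetical order."""
    # \w+ matches runs of letters, digits and underscores; in Python 3 it is
    # Unicode-aware for str patterns, so letters with diacritics are matched too.
    words = re.findall(r"\w+", text.lower())
    counts = Counter(words)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def matching_lines(text: str, pattern: str) -> list[str]:
    """Return the lines that contain a match for the regular expression."""
    regex = re.compile(pattern)
    return [line for line in text.splitlines() if regex.search(line)]


# Hypothetical sample text, used only to demonstrate the two functions.
sample = "The cat sleeps.\nThe dog barks and the cat sleeps.\nA dog runs."

print(word_frequencies(sample)[:3])   # three most frequent words
print(matching_lines(sample, r"^The"))  # lines starting with "The"
```

Lower-casing before counting treats "The" and "the" as one word; whether that is appropriate depends on the research question.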
- Syllabus
- Introduction to the course and the semester plan
- Regular expressions and their use
- Getting to know the UNIX terminal
- Data Flow Management (Input, Output, Redirect)
- cat, tac, head, tail, wc
- grep, sort, uniq, cut
- comm, diff, join, paste, csplit
- tr, nano, sed
- vim
- Basics of Scripting in Bash
- Practice
- Work with text in Python
- Literature
- recommended literature
- Manual pages of the individual utilities.
- BRANDEJS, Michal. UNIX - Linux : praktický průvodce. 1st ed. Praha: Grada, 1996. 340 pp. ISBN 8071691704.
- Teaching methods
- teaching, practicing, discussion
- Assessment methods
- Credit will be awarded for attendance, active participation, and passing the test.
- Language of instruction
- Czech
- Further Comments
- Permalink: https://is.muni.cz/course/phil/spring2018/PLIN057