CS3TM: Text Mining and Natural Language Processing

Module code: CS3TM

Module provider: Computer Science; School of Mathematical, Physical and Computational Sciences

Credits: 20

Level: 6

When you’ll be taught: Semester 2

Module convenor: Professor Xia Hong, email: x.hong@reading.ac.uk

Pre-requisite module(s): BEFORE TAKING THIS MODULE YOU MUST TAKE CS2PP OR TAKE CS2PP22 OR TAKE CS2PP22NU OR TAKE CS2PJ20 (Compulsory)

Co-requisite module(s):

Pre-requisite or Co-requisite module(s):

Module(s) excluded:

Placement information: NA

Academic year: 2025/6

Available to visiting students: Yes

Talis reading list: Yes

Last updated: 8 September 2025

Overview

Module aims and purpose

The aim of this module is to introduce the field of text mining and natural language processing. A key focus is the theory and practice of processing text data at the lexical, syntactic, and semantic levels.

This module also encourages students to develop a set of professional skills, such as problem solving, critical thinking, scientific evaluation, creativity, technical report writing, organisation and time management, and self-reflection.

Module learning outcomes

By the end of the module, it is expected that students will be able to:

Understand and apply the fundamental principles of text mining and natural language processing;
Apply methods and algorithms to process different types of textual data;
Empirically evaluate the performance of methods and algorithms using accuracy and efficiency metrics;
Apply analytical and programming skills using existing NLP methods and tools such as NLTK and scikit-learn (Python); and
Understand ethics in NLP, in particular issues in large language models.

Module content

The module covers the following topics:

Regular expressions, text normalization
N-gram language models, part-of-speech tagging, lexical semantics, word senses and WordNet, syntactic and semantic parsing
Text classification, sentiment analysis
Information extraction, including named entity recognition and relation extraction
Advanced topics: machine learning for NLP, word embeddings, hidden Markov models and the Viterbi algorithm, chatbots, large language models, ethics in NLP

Structure

Teaching and learning methods

The lectures will introduce students to the theories, concepts and underpinning principles specified in the indicative content. In the practical sessions, students will be supervised as they apply these concepts and principles to given problems.
The lectures and practical sessions will enable students to practise using established NLP software, perform analysis and write reports.
Learning materials will also be provided in digital form where required to support learning.

There are two types of assessment (formative and summative), which support and reinforce students’ learning. Formative assessment is carried out through weekly learning activities, either exemplar questions or sample programming problems.

Summative assessment consists of one written coursework assignment and one written examination. The coursework assignment requires students to demonstrate scientific writing in an individual report. Timely feedback will be provided to students to enhance their learning.

Study hours

At least 40 hours of scheduled teaching and learning activities will be delivered in person, with the remaining hours for scheduled and self-scheduled teaching and learning activities delivered either in person or online. You will receive further details about how these hours will be delivered before the start of the module.

Scheduled teaching and learning activities	Semester 1	Semester 2	Summer
Lectures		20
Seminars		10
Tutorials
Project Supervision
Demonstrations
Practical classes and workshops		8
Supervised time in studio / workshop
Scheduled revision sessions		2
Feedback meetings with staff
Fieldwork
External visits
Work-based learning

Self-scheduled teaching and learning activities	Semester 1	Semester 2	Summer
Directed viewing of video materials/screencasts
Participation in discussion boards/other discussions
Feedback meetings with staff
Other
Other (details)

Placement and study abroad	Semester 1	Semester 2	Summer
Placement
Study abroad

Please note that the hours listed above are for guidance purposes only.

Independent study hours	Semester 1	Semester 2	Summer
Independent study hours		160

Please note the independent study hours above are notional numbers of hours; each student will approach studying in different ways. We would advise you to reflect on your learning and the number of hours you are allocating to these tasks.

Semester 1: The hours in this column may include hours during the Christmas holiday period.

Semester 2: The hours in this column may include hours during the Easter holiday period.

Summer: The hours in this column will take place during the summer holidays and may be at the start and/or end of the module.

Assessment

Requirements for a pass

Students need to achieve an overall module mark of 40% to pass this module.

Summative assessment

Type of assessment	Detail of assessment	% contribution towards module mark	Size of assessment	Submission date	Additional information
Online written examination	Exam	50	2 hours	Semester 2 Assessment Period	Answer 3 out of 4 questions
Set exercise	Technical report	50	12 pages (excluding appendices). 20 hours	Semester 2, Teaching Week 11

Penalties for late submission of summative assessment

The Support Centres will apply the following penalties for work submitted late:

Assessments with numerical marks

where the piece of work is submitted after the original deadline (or a DAS-agreed extension as a reasonable adjustment indicated in your Individual Learning Plan): 10% of the total marks available for that piece of work will be deducted from the mark for each calendar day (or part thereof) following the deadline up to a total of three calendar days;
where the piece of work is submitted up to three calendar days after the original deadline (or a DAS-agreed extension as a reasonable adjustment indicated in your Individual Learning Plan), the mark awarded due to the imposition of the penalty shall not fall below the threshold pass mark, namely 40% in the case of modules at Levels 4-6 (i.e. undergraduate modules for Parts 1-3) and 50% in the case of Level 7 modules offered as part of an Integrated Masters or taught postgraduate degree programme;
where the piece of work is awarded a mark below the threshold pass mark prior to any penalty being imposed, and is submitted up to three calendar days after the original deadline (or a DAS-agreed extension as a reasonable adjustment indicated in your Individual Learning Plan), no penalty shall be imposed;
where the piece of work is submitted more than three calendar days after the original deadline (or a DAS-agreed extension as a reasonable adjustment indicated in your Individual Learning Plan): a mark of zero will be recorded.
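
As a rough illustration of how the rules above for numerically marked work combine, the following Python sketch may help. It is not an official calculator. The function name `late_penalty_mark` and its parameters are hypothetical, marks are assumed to be percentages out of 100 (so 10% of the available marks is 10 points), lateness is assumed to be counted in 24-hour periods from the deadline, and the separate provision for deadlines revised through the Assessment Adjustments policy is not modelled.

```python
import math


def late_penalty_mark(raw_mark: float, days_late: float, pass_mark: float = 40.0) -> float:
    """Illustrative only: mark (out of 100) after the late-submission penalty.

    Assumptions (not stated in this form by the policy text):
    - marks are percentages, so "10% of the total marks available" is 10 points;
    - days_late is measured from the deadline in 24-hour periods, and any
      fraction of a day counts as a whole day ("or part thereof").
    """
    if days_late <= 0:
        # Submitted on time: no penalty.
        return raw_mark
    if days_late > 3:
        # More than three calendar days late: a mark of zero is recorded.
        return 0.0
    if raw_mark < pass_mark:
        # Already below the threshold pass mark: no penalty is imposed.
        return raw_mark
    # 10 points per day (or part of a day), but never below the pass mark.
    whole_days = math.ceil(days_late)
    return max(raw_mark - 10.0 * whole_days, pass_mark)
```

Under these assumptions, a Level 6 report marked at 65 and submitted a day and a half late would receive 45, while one marked at 45 and submitted two days late would be held at the pass mark of 40.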

Assessments marked Pass/Fail

where the piece of work is submitted within three calendar days of the deadline (or a DAS-agreed extension as a reasonable adjustment indicated in your Individual Learning Plan): no penalty will be applied;
where the piece of work is submitted more than three calendar days after the original deadline (or a DAS-agreed extension as a reasonable adjustment indicated in your Individual Learning Plan): a grade of Fail will be awarded.

Where a piece of work is submitted late after a deadline which has been revised owing to an extension granted through the Assessment Adjustments policy and process (self-certified or otherwise), it will be subject to the maximum penalty (i.e., considered to be more than three calendar days late). This will also apply when such an extension is used in conjunction with a DAS-agreed extension as a reasonable adjustment.

The University policy statement on penalties for late submission can be found at: /cqsd/-/media/project/functions/cqsd/documents/qap/penaltiesforlatesubmission.pdf

You are strongly advised to ensure that coursework is submitted by the relevant deadline. You should note that it is advisable to submit work in an unfinished state rather than to fail to submit any work.

Formative assessment

Formative assessment is any task or activity which creates feedback (or feedforward) for you about your learning, but which does not contribute towards your overall module mark.

Each weekly topic has defined learning tasks that enable students to reflect on their learning.

Outcomes of the formative assessment for each topic may be given in the tutorial guidance notes or as feedback on online tests.

Pseudocode and executable Python code for basic algorithms are provided weekly.

Reassessment

Type of reassessment	Detail of reassessment	% contribution towards module mark	Size of reassessment	Submission date	Additional information
Online written examination	Exam	100	3 hours	During the University resit period	Answer 4 out of 6 questions

Additional costs

Item	Additional information	Cost
Computers and devices with a particular specification
Printing and binding
Required textbooks
Specialist clothing, footwear, or headgear
Specialist equipment or materials
Travel, accommodation, and subsistence

THE INFORMATION CONTAINED IN THIS MODULE DESCRIPTION DOES NOT FORM ANY PART OF A STUDENT’S CONTRACT.
