GitHub - ua-datalab/NLP-Speech: The repository for U of A Datalab’s “NLP for All” workshop series, where we cover the basics of Natural Language Processing (NLP) and its practical applications for everyday tasks.

Image source: Jeevan chavan's article "NLP: Tokenization , Stemming , Lemmatization , Bag of Words ,TF-IDF , POS"

Natural Language Processing for All

Join us for an engaging and accessible introduction to Natural Language Processing (NLP) and its practical applications for everyday tasks! In "NLP for All," we will explore the fundamental concepts behind NLP, from understanding how computers interpret human language to discovering how to improve search queries, use regular expressions, find datasets, and build pipelines for working with language. Whether you're curious about chatbots, voice assistants, or automated text transcription and analysis, this series will demystify popular technologies and show you how they work.

REGISTER WITH THIS LINK

ACCESS THE DATALAB CALENDAR

What We Will Cover:

Foundations of NLP: Gain a solid grasp of NLP concepts and terminology without needing a technical background.
Real-World Applications: Explore practical uses of NLP in various contexts, such as improving search and information retrieval, generating and evaluating automatic transcriptions, and working with popular libraries such as spaCy, PyTorch and scikit-learn.
Hands-On Experience: We will illustrate NLP concepts in action with well-documented code notebooks built around practical examples. We will also explore online sources for NLP tools and datasets, such as Hugging Face.

Pre-requisites:

A Google account to run Google Colab (where we will do most of our programming exercises)
Basic knowledge of Python. You can brush up on Python fundamentals with Software Carpentry's Introduction to Python (section 1).

Coordinator: Megh Krishnaswamy.
Location: Albert B. Weaver Science-Engineering Library. Room 212.
When: Thursdays at 3PM.

Calendar:

Date	Title	Topic Description	Materials
09/05/2024 3PM	Introduction to NLP with spaCy	Join us for an informative session on the basics of Natural Language Processing (NLP) with spaCy, a leading open-source library for advanced text processing in Python. Designed for production use and capable of handling large volumes of text efficiently, spaCy offers a Swiss Army knife approach to text processing across multiple languages. In this workshop, we will cover tools for key NLP tasks such as tokenization, part-of-speech tagging, named entity recognition, dependency parsing, and text similarity calculation.	Link to Notebook
09/12/2024 3PM	Regular Expressions for NLP	Regular expressions (regex) are an essential tool for advanced text search and analysis. Join us for a comprehensive introduction to the basics of building regular expressions, with a focus on creating and applying patterns to extract, clean, and transform text data effectively. We will explore practical NLP use cases, such as extracting specific information from unstructured data, performing search-and-replace operations, and validating text inputs. Our materials will include resources for getting started with regex syntax, and practical code examples that run regex searches on your desktop, in browser tools, and with Python and R libraries. Join us for a practical demonstration of how to get the best out of your text searches with regex!	Link to notebook
09/19/2024 3PM	NLP with Transformers	In this workshop, we will introduce foundational concepts of the transformer architecture. We will also look at how to choose and use pre-trained models that fit a given task, using popular Python frameworks like TensorFlow and PyTorch. Join us for an informative session on the technology behind Large Language Models, and what it can do to enhance your skills as a researcher!	Link to notebook Link to Slides
09/26/2024 3PM	Introduction to Semantic Search	Dive into the world of semantic search with this workshop, where we will explore NLP-powered options to enhance text search. Unlike traditional keyword-based search, semantic search takes into account the meaning and context behind queries, which can provide more relevant and contextualized results. This workshop covers the fundamentals of semantic search technologies, with introductions to vector representations, embeddings, and advanced search algorithms. We will explore how to implement semantic search with a simple real-world use case, and learn how to find, choose, and use pre-trained models and datasets for our tasks. Join us to learn more about how meaning and context can help you get the best out of your search experience!	Link to Notebook Link to optional notebook
10/03/2024 3PM	Introduction to Information Extraction	Join us for an introductory session on Information Extraction (IE)! With a focus on the automatic extraction of structured information from unstructured text, this session explores why information extraction is a key skill for a variety of research tasks. IE is a critical component of many NLP applications, from data mining to knowledge graph construction. In this workshop, we will cover the fundamentals of information extraction, such as named entity recognition, relationship extraction, and event detection. We will look at various algorithms and tools used in IE. This workshop will provide hands-on experience with a simple project, implemented in Python, that demonstrates how to extract valuable insights from large text corpora. Build your ability to automate information extraction and transform raw text into meaningful data!	Link to Notebook
10/10/2024 3PM	Text pre-processing for NLP	Prepare your text data for advanced analysis with our primer on text pre-processing for Natural Language Processing. Text pre-processing is a crucial step in any NLP pipeline, ensuring that your data is clean, normalized, and ready for modeling. This workshop will introduce pre-processing techniques for text data from sources such as web scraping and online datasets. We will take a look at tools available for categorizing, organizing, and tagging our text. With a practical demonstration, we will explore handling various text formats, dealing with noise, and transforming text into a format suitable for machine learning algorithms. Whether you are interested in an NLP task or just making sense of a data dump, join us for this session on the tools and knowledge needed to prepare your text data effectively!	Link to Notebook
10/17/2024 3PM	Introduction to Speech Technology	Explore the field of Speech Technology with this introductory workshop, designed to improve your knowledge of the principles and applications of speech processing. This is a beginner-friendly, hands-on workshop that covers the basics of acoustic modeling and phonetics, with a brief look at where speech technology is used today. We will discuss real-world applications such as automatic transcription, speech recognition, text-to-speech synthesis, and speaker identification, and take a look at existing tools and techniques for building simple speech-powered tools.	Link to Notebook
10/24/2024 3PM	Speech-to-Text with Whisper AI	Whisper AI, known for its high accuracy and efficiency, is transforming the way we convert spoken language into written text. This workshop provides an overview of Whisper AI's architecture and features, and covers the process of building, training, and deploying speech-to-text models. We will explore real-world applications such as automatic transcription, and look at ways to effectively evaluate our output (such as word error rate, or WER, scores). With practical coding examples, we will cover handling speech data in various languages to achieve high-quality transcription, and explore ways of creating pipelines in Python to save and process our outputs.	Link to Notebook
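
To give a flavor of two calendar topics, here is a minimal sketch that uses only the Python standard library. It is not taken from the workshop notebooks: the sample text, the simplified patterns, and the function names are all assumptions made for illustration. The first three functions mirror the regex use cases named above (extraction, search-and-replace, validation), and `wer` computes word error rate as the word-level edit distance divided by the number of reference words.

```python
import re

# Hypothetical sample text, invented for illustration only.
TEXT = "Contact  alice@example.org or bob@example.com   before 09/12/2024."

# Simplified e-mail pattern; real-world address validation is more involved.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_emails(text: str) -> list[str]:
    """Extraction: return every substring that looks like an e-mail address."""
    return EMAIL.findall(text)


def collapse_spaces(text: str) -> str:
    """Search-and-replace: squeeze runs of whitespace into a single space."""
    return re.sub(r"\s+", " ", text).strip()


def is_date(text: str) -> bool:
    """Validation: accept only strings shaped like MM/DD/YYYY (digits only)."""
    return re.fullmatch(r"\d{2}/\d{2}/\d{4}", text) is not None


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion (reference word missing)
                curr[j - 1] + 1,         # insertion (extra hypothesis word)
                prev[j - 1] + (r != h),  # substitution, or match at no cost
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(extract_emails(TEXT))
    print(collapse_spaces(TEXT))
    print(is_date("09/12/2024"))
    print(wer("the cat sat on the mat", "the cat sat on mat"))
```

This `wer` compares words exactly as written, so in practice both transcripts are usually normalized first (lowercased, punctuation stripped) before scoring. Dedicated evaluation libraries exist for this; the hand-rolled version here is only meant to show what the score measures.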

Intro_to_Semantic_Search
Introduction_to_Information_Extraction
Introduction_to_Regular_Expressions
Introduction_to_Speech_Technology
Introduction_to_transformers
Natural_Language_Processing_with_Spacy
Speech_to_Text_with_Whisper
Text_pre_processing_for_NLP
.DS_Store
LICENSE
README.md
