

Researchers across a wide variety of fields increasingly find themselves having to analyse growing volumes of qualitative information. The objective of this summer school is therefore to provide participants with the toolkit required to plan, conduct and statistically analyse qualitative text. To this end, the course provides an overview of three approaches to text analysis: qualitative analysis, quantitative content analysis and text mining. The opening sessions focus on the fundamental role of data preparation, before moving on to identifying themes and correlations using both the text mining and content analysis approaches. The final sessions address the more advanced topics of importing and exporting data, together with document classification.
In keeping with TStat’s training philosophy, the summer school takes a hands-on approach to qualitative and quantitative text analysis. Each session comprises a theoretical component, in which the techniques and their underlying principles are explained, and an applied (hands-on) segment, during which participants implement the techniques on real data under the guidance of the course tutor. Theoretical sessions are reinforced by case study examples, in which the course tutor highlights the advantages and potential pitfalls of individual techniques. Particular emphasis is placed on the intuition behind the choice and implementation of each technique. In this manner, the course leader is able to bridge the often difficult gap between abstract theoretical methodologies and the practical issues encountered when conducting text analysis on real data. Throughout the course, the applied sessions are carried out using Provalis Research’s QDA Miner, WordStat and SimStat software. WordStat is a flexible text analysis package offering both text mining tools for the fast extraction of themes and trends and state-of-the-art quantitative content analysis tools. In conjunction with SimStat (Provalis Research’s statistical data analysis tool) and QDA Miner (for qualitative data analysis), it offers users a powerful and flexible integrated toolkit for qualitative and quantitative text analysis.
At the end of the summer school, participants are expected to be able to implement independently, with the aid of the routines used during the sessions, the theories and methodologies discussed during the week. In particular, participants should be able to identify the type of data required for their specific research topic; evaluate which methodology is most appropriate for the analysis in hand; and test the appropriateness and sensitivity of their estimated model and the robustness of the results obtained.
The summer school is aimed at:
academic researchers, evaluators, policy advisers, social workers, educators and students working in economics, public health, sociology, psychology and political science;
data mining and market research analysts in the automotive, market research, logistics, transportation and telecommunications sectors who need to analyse comments from surveys, blogs, websites, social media platforms and other textual sources;
insurance analysts looking to analyse and categorize claims from customers;
researchers based in pharmaceutical companies and medical research laboratories required to analyse healthcare reports, notes from medical doctors, interviews and/or focus groups with patients.
SETTING THE SCENE
SESSION I: THREE APPROACHES TO TEXT ANALYSIS
Qualitative Analysis
Quantitative Content Analysis
Text Mining
SESSION II: QDA MINER AND WORDSTAT – A BRIEF OVERVIEW
QDA MINER
Introduction and project management
Codebook management and manual coding
Security features and text retrieval tools
Coding Frequency and Retrieval
Code co-occurrence and case similarity analysis
Assessing relationship between coding and variables
Using the Report Manager and the Command Log
Performing teamwork
Miscellaneous Functions
WORDSTAT
Content Analysis or Text Mining
Analyzing words without dictionaries – a text mining approach
Content Analysis – Principles of dictionary construction
Importing and exporting data
Introduction to automatic document classification
QDA MINER
SESSION I: INTRODUCTION AND PROJECT MANAGEMENT
Introduction to CAQDAS using QDA Miner
The CASE x VARIABLE file structure
The Mixed-Method approach
Quick overview of the work environment
The four windows – CASE, VARIABLES, CODES, and DOCUMENT
The menu system
Creating a new project
Creating a new project from a list of documents
Creating a new project from an existing data file
Creating an empty project / defining structure
Using the document conversion wizard
Customizing and personalizing the project
The PROJECT | PROPERTIES dialog
The PROJECT | NOTES command
Manipulating variables
Adding a variable
Deleting a variable
Changing the variable data type
Recoding the values of a variable
Reordering variables
Changing variable properties
Manipulating cases
Adding a new case
Deleting cases
Importing new documents in new cases
Changing the case grouping and description
SESSION II: CODEBOOK MANAGEMENT AND MANUAL CODING
Creating codes and managing the codebook
Creating codes and categories
Modifying an existing code
Deleting existing codes
Moving codes in the codebook
Merging codes in the codebook
Splitting codes in the codebook
Importing an existing codebook
Manual coding of documents (versus autocoding)
The four basic methods for assigning codes to text segments:
Highlight text segment then drag a code
Highlight text segment then double-click a code
Highlight text segment then select code and button (toolbar)
Drag and drop a code over a paragraph (or a sentence – press ALT)
Assignment of multiple codes to the same segment (press CTRL)
Modifying existing coding
Working with code marks
Viewing coding information
Adding a comment to a coding
Removing a coding
Changing the code assigned to a text segment
Resizing a segment
Consolidating codes
Searching and replacing codes
Hiding code marks
Highlighting coded segments
SESSION III: SECURITY FEATURES AND TEXT RETRIEVAL TOOLS
Using backup features
Creating a permanent backup
Restoring a backup
Using the temporary session backup
Text retrieval tools (4)
Searching for text
Performing a simple text search
Performing a complex text search (using Boolean and wildcard operators)
Performing a thesaurus search
Using the “search hits” table
Performing manual coding and autocoding
Saving to disk or printing the table
Retrieving sections in structured documents
Performing a query by example
Finding text similar to a sample text segment
Providing relevance feedback to improve search results
Finding text similar to specific coded segments
Performing “fuzzy string matching”
Performing a keyword search
Assigning keywords to codes
Performing a keyword retrieval on internal codes
Performing a keyword retrieval on WordStat dictionary files
SESSION IV: CODING FREQUENCY AND RETRIEVAL
Coding frequency
Creating a frequency list of all codes
Creating a bar chart or a pie chart on selected codes
Customizing the chart
Coding Retrieval
Performing a simple coding retrieval
Performing a complex search
Creating a text report
Creating a new project from
A shortcut for simple coding retrieval
Saving and Retrieving Queries
Retrieving a list of comments
SESSION V: CODE CO-OCCURRENCE AND CASE SIMILARITY ANALYSIS
Analyzing code co-occurrences
Hierarchical clustering of codes
2D and 3D multidimensional scaling plots
Using the Proximity plots
Assessing similarity of cases
Analyzing code sequences
Choosing codes and setting minimum / maximum distances
Using the Sequence matrix
Searching and coding specific sequences
SESSION VI: ASSESSING RELATIONSHIP BETWEEN CODING AND VARIABLES
Analyzing coding by variables
Crosstabulating coding frequency by variables
Setting the content and format of the table
Computing correlation or comparison statistics
Comparing frequencies using bar charts or line charts
Creating and interpreting 2D and 3D correspondence plots
Creating and interpreting heatmaps
A quick overview of graphic coding features
SESSION VII: USING THE REPORT MANAGER AND THE COMMAND LOG
Using the Report Manager
Accessing the Report Manager
The Report Manager interface
Appending tables, graphics and quotes
Moving and organizing items using the table of contents
Editing existing items / adding comments
Adding empty documents or folders and deleting existing items
Importing documents, images or tables
Searching and replacing text
Exporting results to HTML, Word or RTF files
Using the Command Log
Introduction to the command log – Filtering log entries
Adding comments to log entries
Undoing previously performed operations
Repeating previously performed operations
Exporting the log table to disk
SESSION VIII: PERFORMING TEAMWORK
Preparing projects for teamwork
Creating user accounts and setting privileges
Creating new accounts
Defining user access rights
Forcing users to log in
Creating duplicate copies of a project
Sending a project by email
Merging projects and assessing coding reliability
Merging two or more projects
Planning teamwork for assessing coding agreement
Adjusting colors of code marks
Computing coding agreement
The codebook and segmentation problems
Four levels of agreement
Presence or absence (0 or 1)
Frequency (0, 1, 2, etc.)
Coding importance (% of words)
Coding overlap
Correcting (or not) for chance agreement
Identifying disagreements
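As an illustration of what correcting for chance agreement involves at the presence-or-absence level, the following minimal sketch compares raw percent agreement with Cohen's kappa for two coders. Cohen's kappa is chosen here only as one common chance-corrected statistic; it is not necessarily the statistic computed by QDA Miner, and the function name and data are hypothetical.

```python
from collections import Counter


def cohen_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two coders rating the same items."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items on which the coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement: probability of agreeing by chance, given each
    # coder's own marginal distribution of labels.
    freq_a = Counter(coder_a)
    freq_b = Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders always used the same single label
    return (observed - expected) / (1 - expected)


# Hypothetical data: 1 = code present in the case, 0 = code absent.
coder_a = [1, 1, 0, 0, 1, 0, 1, 0]
coder_b = [1, 0, 0, 0, 1, 0, 1, 1]

percent_agreement = sum(a == b for a, b in zip(coder_a, coder_b)) / len(coder_a)
kappa = cohen_kappa(coder_a, coder_b)
print(f"percent agreement: {percent_agreement:.2f}, kappa: {kappa:.2f}")
```

In this made-up example the coders agree on six of eight cases, yet half of that agreement would be expected by chance alone, which is why the corrected figure is noticeably lower than the raw one.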
WORDSTAT
SESSION IX: BASIC WORD STATISTICS AND TEXT MINING
Content Analysis or Text Mining
Running WordStat from QDA Miner or Simstat
Analyzing words without dictionaries – a text mining approach
Data preparation – misspelling and control characters
Basic word frequency analysis
Application of text pre-processing methods
Exclusion list – use with care
Lemmatization and stemming – limits and benefits
Setting upper and lower frequency criteria
A few additional options
Numeric and other non-alphabetic characters
Braces and square brackets
Random sampling
Using disk or memory as the working space
Identifying themes using word co-occurrence analysis
Clustering words and measuring their proximity
Clustering documents based on the words they contain
Correlation and comparison analysis based on word usage
Performing crosstabs and computing statistics
Comparing words among the sources (document or text variables)
Correspondence analysis and heatmaps
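The pre-processing ideas listed in this session (an exclusion list plus lower and upper frequency criteria) can be sketched outside WordStat in a few lines of Python. This is only an illustrative approximation under stated assumptions: the exclusion list, the parameter names `min_freq` and `max_doc_share`, and the sample comments are all hypothetical and do not reproduce WordStat's own options.

```python
import re
from collections import Counter

# Hypothetical exclusion (stop word) list; a real one would be far longer.
EXCLUSION_LIST = {"the", "a", "of", "and", "to", "is", "was"}


def word_frequencies(documents, exclusion=EXCLUSION_LIST,
                     min_freq=1, max_doc_share=1.0):
    """Count words, dropping excluded words and those outside the thresholds."""
    tokens_per_doc = [
        [w for w in re.findall(r"[a-z]+", doc.lower()) if w not in exclusion]
        for doc in documents
    ]
    # Total occurrences of each word across all documents.
    term_freq = Counter(w for tokens in tokens_per_doc for w in tokens)
    # Number of documents in which each word appears at least once.
    doc_freq = Counter(w for tokens in tokens_per_doc for w in set(tokens))
    n_docs = len(documents)
    return {
        w: count
        for w, count in term_freq.items()
        if count >= min_freq and doc_freq[w] / n_docs <= max_doc_share
    }


# Hypothetical survey comments.
docs = [
    "The service was slow and the staff was rude",
    "Slow delivery and rude staff",
    "Fast delivery, friendly staff",
]
freqs = word_frequencies(docs, min_freq=2, max_doc_share=0.9)
print(freqs)
```

Here the lower criterion drops words seen only once, while the upper criterion drops "staff" because it occurs in every comment and so does not help to tell the comments apart. This also shows why such filters need care: a word removed for being too common may still matter for the research question.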
SESSION X: CONTENT ANALYSIS – PRINCIPLES OF DICTIONARY CONSTRUCTION
Introduction to WordStat categorization dictionary
Dictionary structure and functions
Opening, saving, and creating categorization dictionaries
Manually creating categories of words and phrases
Principles of dictionary construction – Extracting features
Identification of technical terms and proper names (persons, places, products)
Identification of common misspellings
Extracting phrases
Creating an initial dictionary – phrases, technical terms, proper nouns and words
Adding words manually
Adding words from tables
Using the drag and drop editor
Organizing the dictionary (drag and drop)
Applying the dictionary
Setting different levels
Mixing dictionaries with words
Validating the dictionary
Finding words or phrases with improper meanings using the KWIC list
WordStat evaluation order – how to use this to your advantage
Disambiguation methods
Manual disambiguation
Disambiguation using phrases
Disambiguation using rules
Improving categorization dictionaries
Creating comprehensive dictionaries using the Suggest button
Assessing coverage using the keyword retrieval feature
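To make the idea of a categorization dictionary concrete, the sketch below applies a small dictionary of words and phrases to a text and counts hits per category. It is an illustration only: the dictionary contents and function name are hypothetical, and the phrases-before-words ordering is an assumption chosen for the example, not a description of WordStat's actual evaluation order.

```python
import re
from collections import Counter

# Hypothetical categorization dictionary: category -> words and phrases.
DICTIONARY = {
    "SERVICE": ["staff", "customer service", "support"],
    "PRICE": ["price", "expensive", "cheap"],
}


def categorize(text, dictionary=DICTIONARY):
    """Count category hits, matching longer entries (phrases) before words."""
    # Normalize to lower-case words separated by single spaces.
    remaining = " ".join(re.findall(r"[a-z]+", text.lower()))
    # Sort entries so multi-word phrases are evaluated first (assumed order).
    entries = sorted(
        ((entry, category)
         for category, items in dictionary.items()
         for entry in items),
        key=lambda pair: len(pair[0].split()),
        reverse=True,
    )
    counts = Counter()
    for entry, category in entries:
        pattern = r"\b" + re.escape(entry) + r"\b"
        # Remove matched text so it cannot be counted again by a later entry.
        remaining, n_hits = re.subn(pattern, " ", remaining)
        counts[category] += n_hits
    return {category: n for category, n in counts.items() if n}


# Hypothetical customer comment.
comment = ("Customer service was helpful, but the price is expensive "
           "and the staff support was slow.")
result = categorize(comment)
print(result)
```

Removing matched text before evaluating shorter entries is one simple way to use phrases for disambiguation: once "customer service" has been counted, its component words are no longer available to be matched by a single-word entry.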
SESSION XI: ADVANCED FEATURES
Importing and exporting data
Exportation of frequency data
We are currently finalising our 2023 training calendar. If the dates for the course you are interested in have not yet been published, please revisit our website periodically or contact us at formazione@tstat.it; you will then be notified by email as soon as the dates are available.