Open-ended coding: classifying verbatim responses with SmartInterview

Written by Matthieu SAUSSAYE

Published Feb 18, 2026

Open-ended codification

SmartInterview's Pulse Classifier is an AI-powered tool that automates the codification of open-ended survey responses. It accepts Excel files (.xlsx, .xls), SPSS files (.sav), and in-platform SmartInterview surveys, letting you classify thousands of verbatim responses in minutes instead of hours.

It preserves your original data intact and appends structured classification columns (codes and sentiment), making it immediately compatible with your existing analysis workflows.

Key capabilities include:

  • Automatic code generation from respondent answers

  • Predefined codelist import from your Excel Topics sheet

  • Sentiment analysis (Positive / Negative / Neutral) per code

  • Multi-column classification for files with several open-ended questions

  • Real-time progress tracking so you can work on other tasks while classification runs

  • Estimated code counts before running the full classification

1. Getting Started

Supported Input Formats

| Format | Description |
|---|---|
| .xlsx | Microsoft Excel 2007+ (primary format) |
| .xls | Legacy Excel 97-2003 |
| .sav | SPSS data files |
| In-platform survey | Active SmartInterview surveys (auto-imported) |
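If you prepare files with a script before uploading, a quick pre-check can catch unsupported formats early. The following is a minimal Python sketch based only on the table above; the function name `is_supported_upload` is hypothetical and not part of SmartInterview.

```python
from pathlib import Path

# File extensions listed as supported upload formats in this article.
SUPPORTED_EXTENSIONS = {".xlsx", ".xls", ".sav"}


def is_supported_upload(filename: str) -> bool:
    """Return True if the file extension is one of the supported upload formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_upload("Survey_Raw.xlsx"))  # True
print(is_supported_upload("responses.csv"))    # False
```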

To begin, navigate to the Pulse Classifier page from your dashboard. You can either upload a file (drag-and-drop or click to browse) or select an active survey from your account.

When importing from an active SmartInterview survey, the configuration automatically adapts to the platform's data structure.

2. Configuration

Once your file is uploaded, a configuration dialog titled "Configuration de la classification" opens. Use it to set the classification parameters before launching the process.

Sheet Selection

Use the "Feuille avec les données" dropdown to select the sheet containing your respondent data. SmartInterview shows an Excel preview (5 rows) so you can verify you've selected the correct sheet.

If your file has no header row, click "+ Pas d'en-tete" to tell the system that the first row is data, not column names.

Column Mapping

Under "Selection des colonnes", map two required columns:

  • Colonne Respondent ID (blue indicator): the unique identifier for each respondent (e.g., Respondent_ID, user_id, Respondent_Serial)

  • Colonne Reponses (purple indicator): the column containing the open-ended answers to classify (e.g., Q1, Q2, question_id)

SmartInterview auto-detects common column names, but you can always override the selection using the dropdowns.
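To give an idea of what name-based auto-detection can look like, here is a minimal Python sketch. It is not SmartInterview's actual detection logic: the candidate list simply reuses the example ID column names from this article, and `guess_id_column` is a hypothetical helper.

```python
from typing import Optional, Sequence

# Assumed candidate names, taken from the example ID columns in this article.
ID_CANDIDATES = ("respondent_id", "user_id", "respondent_serial")


def guess_id_column(headers: Sequence[str]) -> Optional[str]:
    """Return the first header matching a known ID name (case-insensitive), else None."""
    for header in headers:
        if header.strip().lower() in ID_CANDIDATES:
            return header
    return None


print(guess_id_column(["Respondent_ID", "Q1", "Q2"]))  # Respondent_ID
print(guess_id_column(["Q1", "Q2"]))                   # None
```

When nothing matches, as in the second call, the column has to be picked manually, which mirrors the dropdown override described above.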

Topics Sheet

Under "Feuille avec les topics", select the sheet containing your predefined codelist. If your Excel file includes a sheet named Topics with two columns (Valeur and Libelle), it will be detected automatically.

Click "Charger les colonnes" to load and preview the topics from the selected sheet. The system displays how many topics were detected (e.g., "45 topics detectes dans la feuille Topics").

If no topics sheet exists, select "Aucune (detection automatique)" and SmartInterview will generate codes for you automatically (see Code Generation).

3. Code Generation

SmartInterview offers two approaches for defining your codelist:

A. Import Existing Codes

If your Excel file already contains a Topics sheet with a predefined codelist, SmartInterview reads it directly. The expected format is:

| Valeur | Libelle |
|---|---|
| 1 | Qualite du produit / Produktqualitat / Product quality |
| 2 | Service client / Kundendienst / Customer service |
| 3 | Rapport qualite-prix / Preis-Leistung / Value for money |
| 4 | Facilite d'utilisation / Benutzerfreundlichkeit / Ease of use |
| 5 | Livraison / Lieferung / Delivery |
| ... | ... |

SmartInterview supports multilingual topic labels separated by / (e.g., Qualite du produit / Produktqualitat / Product quality), enabling cross-language matching. A respondent answering "Die Qualitat ist hervorragend" in German will be correctly matched to a topic originally labeled in French as "Qualite du produit".
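The label format can be illustrated with a short Python sketch that splits each Libelle into its language variants. This only shows how the " / " separator structures a label; it says nothing about how SmartInterview matches responses internally, and `split_label_variants` is a hypothetical name.

```python
from typing import Dict, List, Tuple


def split_label_variants(rows: List[Tuple[int, str]]) -> Dict[int, List[str]]:
    """Map each code (Valeur) to the language variants found in its Libelle."""
    return {
        code: [variant.strip() for variant in label.split(" / ") if variant.strip()]
        for code, label in rows
    }


topics = [
    (1, "Qualite du produit / Produktqualitat / Product quality"),
    (5, "Livraison / Lieferung / Delivery"),
]
variants = split_label_variants(topics)
print(variants[5])  # ['Livraison', 'Lieferung', 'Delivery']
```

Splitting on " / " (with surrounding spaces) rather than a bare slash keeps labels such as "N/A" in one piece.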

B. Auto-Generate Codes

When no codelist is available, click "Generer le plan de code". The AI samples your responses and infers the most representative topics. You can then review and edit them before launching the classification.

Once generated, you enter the Topic Editor where you can:

  • Rename any topic by clicking on its label

  • Reorder topics by drag-and-drop (grip handle on the left)

  • Delete individual topics using the trash icon

  • Add new topics with the "+ Ajouter un topic" button

  • Regenerate the entire codelist if needed

4. Code Counting & Estimation

Before running the full classification, SmartInterview provides approximate code counts — an estimate of how many respondents will be assigned to each topic.

These estimates appear as colored badges next to each topic:

  • Green numbers indicate topics with meaningful response volume

  • Red/low numbers highlight topics that may be underrepresented

A notice reminds you: "Estimations approximatives: classifiez pour obtenir les valeurs précises."

This preview helps you refine your codelist before committing to the full classification: merge underperforming topics, split broad ones, or remove irrelevant codes.
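If you copy the estimated counts into a script, the same review can be done programmatically. The sketch below flags topics under a minimum count; the threshold of 10 and the sample numbers are illustrative assumptions, not values used by SmartInterview.

```python
from typing import Dict, List


def low_volume_topics(estimates: Dict[str, int], min_count: int = 10) -> List[str]:
    """Return the topic labels whose estimated count falls below min_count."""
    return [label for label, count in estimates.items() if count < min_count]


# Illustrative estimates only.
estimates = {"Service client": 120, "Livraison": 35, "Emballage": 4}
print(low_volume_topics(estimates))  # ['Emballage']
```

Topics returned here are candidates for merging or removal before the full classification.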

5. Code Deletion & Editing

The topic editor gives you full control over your codelist:

  • Delete a topic: Click the trash icon next to any topic. The numbering automatically adjusts.

  • Rename a topic: Click on the label text and edit it inline.

  • Reorder topics: Drag the grip handle to change the rank order.

  • Add a topic: Use the "+ Ajouter un topic" button at the bottom of the list.

All changes are reflected instantly in the editor. The final codelist is what will be used during classification and exported in the output file.
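The automatic renumbering after a deletion can be pictured with a small Python sketch. It assumes the codelist is a list of (Valeur, Libelle) pairs and that codes are renumbered consecutively from 1; `delete_topic` is a hypothetical helper, not a SmartInterview function.

```python
from typing import List, Tuple


def delete_topic(topics: List[Tuple[int, str]], code: int) -> List[Tuple[int, str]]:
    """Remove the topic with the given code and renumber the remaining topics from 1."""
    remaining = [label for value, label in topics if value != code]
    return list(enumerate(remaining, start=1))


topics = [(1, "Product quality"), (2, "Customer service"), (3, "Delivery")]
print(delete_topic(topics, 2))  # [(1, 'Product quality'), (2, 'Delivery')]
```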

6. Multiple Open-Ended Questions

If your file contains several open-ended columns (e.g., Q1_1, Q2_1, Q3_1), you can classify them all in a single operation.

How It Works

  1. Configure the first column as described above

  2. Click "Colonne suivante" to add another column

  3. Column tabs appear at the top of the dialog (e.g., Q1_1, Q2_1)

  4. Configure each column independently: select the response column, topics sheet, and settings

  5. Click "Lancer X classifications" to start all columns at once

Each column can have its own topics sheet and settings. A green checkmark appears on completed column tabs.

SmartInterview processes each column as a separate classification job, running them concurrently. You can track the progress of each one individually.
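Conceptually, one job per column running concurrently looks like the Python sketch below. It is only an illustration of the pattern: `classify_column` is a placeholder that counts non-empty answers instead of calling a classifier, and the thread pool is an assumption about how such jobs could be scheduled.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def classify_column(column: str, responses: List[str]) -> Tuple[str, int]:
    """Placeholder job: a real job would classify each answer; this counts non-empty ones."""
    return column, sum(1 for answer in responses if answer.strip())


columns: Dict[str, List[str]] = {
    "Q1_1": ["Great service", "", "Too expensive"],
    "Q2_1": ["Fast delivery"],
}

# Submit one job per column, then collect each result independently.
with ThreadPoolExecutor() as pool:
    futures = [pool.submit(classify_column, name, answers) for name, answers in columns.items()]
    results = dict(future.result() for future in futures)

print(results)  # {'Q1_1': 2, 'Q2_1': 1}
```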

7. Tolerance Level

The "Seuil de tolerance" slider (range: 1 to 5) controls how aggressively the AI assigns codes to responses.

| Level | Behavior |
|---|---|
| 1 | Conservative: Assigns fewer codes per response. Only high-confidence matches. |
| 3 | Balanced (default): Good trade-off between precision and recall. |
| 5 | Permissive: Assigns more codes per response. Captures weaker associations. |

Increasing the tolerance increases the number of codes attributed per response. A higher tolerance is useful when respondents give long, multi-topic answers and you want to capture every nuance. A lower tolerance is better for short answers or when precision matters more than coverage.
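One way to picture the trade-off is as a minimum match score that drops as the tolerance rises. The Python sketch below is purely illustrative: the threshold values and the per-code scores are invented for the example, and SmartInterview's internal scoring is not documented here.

```python
from typing import Dict, List

# Hypothetical mapping from slider level to a minimum match score.
THRESHOLDS = {1: 0.90, 2: 0.75, 3: 0.60, 4: 0.45, 5: 0.30}


def assign_codes(scores: Dict[int, float], tolerance: int = 3) -> List[int]:
    """Keep the codes whose score reaches the threshold for this tolerance level."""
    threshold = THRESHOLDS[tolerance]
    return sorted(code for code, score in scores.items() if score >= threshold)


# Invented match scores for one response, keyed by topic code.
scores = {1: 0.95, 2: 0.65, 5: 0.35}
print(assign_codes(scores, tolerance=1))  # [1]
print(assign_codes(scores, tolerance=3))  # [1, 2]
print(assign_codes(scores, tolerance=5))  # [1, 2, 5]
```

The same response receives one, two, or three codes depending on the level, which is the behavior the slider is described as controlling.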

8. Sentiment Analysis

SmartInterview automatically performs sentiment analysis alongside topic classification. For each code assigned to a response, the AI determines whether the respondent's tone is:

  • Positive

  • Negative

  • Neutral

Sentiment results are added as dedicated columns in the output file (see Excel Output Structure), making it easy to cross-tabulate topics by sentiment in your analysis tool.

Special cases like "Don't know" or "Other" are always classified as Neutral.
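The special-case rule can be expressed as a simple override, sketched below in Python. The labels "Ne sait pas" and "Autre" come from the sample codelist in this article; the function `final_sentiment` and the idea of applying the rule as a post-processing step are assumptions.

```python
# Special-case topic labels from the sample codelist (Don't know / Other).
SPECIAL_LABELS = {"Ne sait pas", "Autre"}


def final_sentiment(topic_label: str, predicted: str) -> str:
    """Force Neutral for special-case topics; otherwise keep the predicted sentiment."""
    return "Neutral" if topic_label in SPECIAL_LABELS else predicted


print(final_sentiment("Autre", "Negative"))      # Neutral
print(final_sentiment("Livraison", "Positive"))  # Positive
```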

9. Running Classifications & Waiting Time

Once you click "Confirmer et lancer la classification" (or "Confirmer et classifier" from the topic editor), the classification begins processing in the background.

Background Processing

Classifications run in the background, so you can navigate to other pages, work on other surveys, or configure additional classifications while the process runs. A floating badge at the bottom of the screen reminds you that classifications are in progress.

Click the badge to open the Classifications drawer, which shows real-time progress for all active jobs. For each job, you can see:

  • File name and column being classified (e.g., Survey_Raw.xlsx, Colonne: Q1_1)

  • Progress bar with percentage (0% to 100%)

  • Estimated time remaining (e.g., ~2m30s)

  • Cancel button (red X) to stop a running classification

For multi-column classifications, a summary header shows overall batch progress: "Multi-classification (0/2 terminees)".
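A remaining-time figure like the one in the drawer can be derived from elapsed time and progress. The Python sketch below assumes a simple linear extrapolation, which may differ from how SmartInterview computes its estimate; `estimate_remaining` is a hypothetical helper.

```python
def estimate_remaining(elapsed_seconds: float, progress: float) -> str:
    """Extrapolate the remaining time linearly; progress is a fraction in (0, 1]."""
    if not 0 < progress <= 1:
        raise ValueError("progress must be greater than 0 and at most 1")
    remaining = round(elapsed_seconds * (1 - progress) / progress)
    minutes, seconds = divmod(remaining, 60)
    return f"~{minutes}m{seconds:02d}s"


# 100 seconds elapsed at 40% progress leaves about 150 seconds.
print(estimate_remaining(100, 0.4))  # ~2m30s
```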

Typical Processing Times

Processing time depends on the number of respondents and the tolerance level. As a general indication:

| Respondents | Approximate Time |
|---|---|
| 100 | ~2 minutes |
| 500 | ~3 minutes |
| 1,000+ | ~5 minutes |

You do not need to keep the page open. The classification runs server-side and results will be available when you return.

10. Top Topics

Once classification is complete, the output Excel file includes a "Top Topics" sheet that ranks topics by frequency across all respondents.

| Rang | Libelle | Compte |
|---|---|---|
| 1 | Service client / Kundendienst / Customer service | 312 |
| 2 | Qualite du produit / Produktqualitat / Product quality | 287 |
| 3 | Rapport qualite-prix / Preis-Leistung / Value for money | 145 |
| 4 | Facilite d'utilisation / Benutzerfreundlichkeit / Ease of use | 98 |
| 5 | Livraison / Lieferung / Delivery | 73 |
| 6 | Ne sait pas | 42 |
| 7 | Autre | 18 |

This gives you an instant overview of the most frequently mentioned themes, sorted by count. Use this sheet to quickly identify dominant topics, spot emerging issues, and prioritize your analysis — without manually reading through hundreds of verbatims.
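The ranking in this sheet amounts to a frequency count over all assigned codes. The Python sketch below reproduces that idea on a tiny invented dataset; it assumes each respondent's result is a list of codes and is not SmartInterview's own export code.

```python
from collections import Counter
from typing import Dict, List, Tuple


def top_topics(assignments: List[List[int]], labels: Dict[int, str]) -> List[Tuple[int, str, int]]:
    """Return (Rang, Libelle, Compte) rows sorted by descending count."""
    counts = Counter(code for codes in assignments for code in codes)
    return [
        (rank, labels[code], count)
        for rank, (code, count) in enumerate(counts.most_common(), start=1)
    ]


labels = {1: "Product quality", 2: "Customer service", 3: "Value for money"}
assignments = [[1, 2], [3], [2], [2, 1]]  # codes assigned to four respondents
for row in top_topics(assignments, labels):
    print(row)
# (1, 'Customer service', 3)
# (2, 'Product quality', 2)
# (3, 'Value for money', 1)
```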

11. Special Cases

SmartInterview handles several edge cases automatically:

Multilingual Responses

Topic labels can include multiple language variants separated by /. For example:

Qualite du produit / Produktqualitat / Product quality / Qualita del prodotto

The AI performs cross-language semantic matching. A respondent answering "Die Lieferung war sehr schnell" in German will be correctly matched to a topic labeled "Livraison / Lieferung / Delivery". Similarly, an Italian response like "Ottimo servizio clienti" will match "Service client / Kundendienst / Customer service".

This is particularly useful in multilingual markets (e.g., Switzerland with FR/DE/IT/EN) where respondents answer in their preferred language but topics need to be consolidated into a single codelist.

12. Excel Output Structure

The classified file preserves your original data and appends new columns:

Main Data Sheet (e.g., FilesQO)

| Respondent_ID | Q1_1a | Q1_1aCOMM1 | Q1_1aCOMM1_SENTIMENT | Q1_1aCOMM2 | Q1_1aCOMM2_SENTIMENT |
|---|---|---|---|---|---|
| 1001 | J'adore la qualite du produit, le service est toujours rapide et efficace | 1 | Positive | 2 | Positive |
| 1002 | Le prix est trop eleve par rapport a ce qu'on recoit, franchement decevant | 3 | Negative | | |
| 1003 | Tres facile a utiliser, l'interface est claire et intuitive | 4 | Positive | | |
| 1004 | Je ne sais pas | 6 | Neutral | | |
| 1005 | Die Lieferung war sehr schnell, aber die Verpackung war beschadigt | 5 | Positive | 1 | Negative |
| 1006 | Ottimo servizio clienti, sempre disponibili e cortesi | 2 | Positive | | |
| 1007 | Nothing special to say, it does the job | 7 | Neutral | | |

  • Q1_1a is the original verbatim column (open-ended responses)

  • Q1_1aCOMM1 and Q1_1aCOMM2 contain the topic code numbers (matching the Valeur in the Topics sheet). The column name is derived from the response column: Q1_1a + COMM + rank.

  • Q1_1aCOMM1_SENTIMENT and Q1_1aCOMM2_SENTIMENT contain the sentiment label for each code assignment

  • Multiple COMM/SENTIMENT column pairs are created when a response matches several topics
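The naming rule for the appended columns can be sketched in Python as follows. The helper `classification_columns` is hypothetical; it only applies the "response column + COMM + rank" pattern described above to one respondent's code assignments.

```python
from typing import Dict, List, Tuple


def classification_columns(
    response_column: str, assignments: List[Tuple[int, str]]
) -> Dict[str, object]:
    """Build one respondent's appended cells: <col>COMM<rank> and <col>COMM<rank>_SENTIMENT."""
    cells: Dict[str, object] = {}
    for rank, (code, sentiment) in enumerate(assignments, start=1):
        cells[f"{response_column}COMM{rank}"] = code
        cells[f"{response_column}COMM{rank}_SENTIMENT"] = sentiment
    return cells


# Respondent 1005 from the sample above: code 5 (Positive) and code 1 (Negative).
print(classification_columns("Q1_1a", [(5, "Positive"), (1, "Negative")]))
# {'Q1_1aCOMM1': 5, 'Q1_1aCOMM1_SENTIMENT': 'Positive', 'Q1_1aCOMM2': 1, 'Q1_1aCOMM2_SENTIMENT': 'Negative'}
```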

Topics Sheet

| Valeur | Libelle |
|---|---|
| 1 | Qualite du produit / Produktqualitat / Product quality |
| 2 | Service client / Kundendienst / Customer service |
| 3 | Rapport qualite-prix / Preis-Leistung / Value for money |
| 4 | Facilite d'utilisation / Benutzerfreundlichkeit / Ease of use |
| 5 | Livraison / Lieferung / Delivery |
| 6 | Ne sait pas |
| 7 | Autre |

Top Topics Sheet

| Rang | Libelle | Compte |
|---|---|---|
| 1 | Service client / Kundendienst / Customer service | 312 |
| 2 | Qualite du produit / Produktqualitat / Product quality | 287 |
| 3 | Rapport qualite-prix / Preis-Leistung / Value for money | 145 |
| 4 | Facilite d'utilisation / Benutzerfreundlichkeit / Ease of use | 98 |
| 5 | Livraison / Lieferung / Delivery | 73 |

13. Quality Assurance

SmartInterview includes several safeguards to ensure classification quality:

  • Preview before launch: The Excel preview and topic estimation let you verify your configuration before launching.

  • Cancel anytime: Running classifications can be cancelled from the progress drawer. The system stops gracefully and releases reserved resources.

  • Re-classify: If results are unsatisfactory, adjust your codelist or tolerance and re-run the classification on the same file.

  • Matching: The AI uses semantic similarity, not just keyword matching. Synonyms, abbreviations, and multilingual variants are recognized automatically.

  • Multiple parallel prompts: The tolerance setting runs multiple AI passes per response and merges the results, reducing variance and improving coverage.
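Merging several passes can be pictured as a vote over the codes each pass proposes. The Python sketch below is an assumption about what such a merge could look like, not a description of SmartInterview's implementation; `merge_passes` and the `min_votes` parameter are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Set


def merge_passes(passes: Iterable[Set[int]], min_votes: int = 1) -> List[int]:
    """Keep codes proposed by at least min_votes passes (min_votes=1 is a plain union)."""
    votes = Counter(code for codes in passes for code in codes)
    return sorted(code for code, count in votes.items() if count >= min_votes)


# Codes proposed for one response by three independent passes.
passes = [{1, 2}, {2}, {2, 5}]
print(merge_passes(passes))               # [1, 2, 5]
print(merge_passes(passes, min_votes=2))  # [2]
```

A plain union favors coverage, while requiring agreement between passes favors precision.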

14. Reported ROI

The Pulse Classifier dramatically reduces the time required for open-ended codification:

| Metric | Manual Codification | SmartInterview |
|---|---|---|
| 100 responses | 1 - 2 hours | ~2 minutes |
| 500 responses | 4 - 8 hours | ~3 minutes |
| 1,000 responses | 1 - 2 days | ~5 minutes |
| Codelist creation | 1 - 3 hours | Automatic |
| Sentiment tagging | Separate pass | Included |
| Multi-question files | Sequential | Parallel |

Beyond time savings, automated classification provides consistency — every response is evaluated against the same criteria, eliminating inter-coder variability that affects manual codification.

Next Steps

You're now ready to start classifying open-ended responses with SmartInterview.

  1. Upload your file or select an active survey

  2. Configure your columns and topics

  3. Review estimated code counts

  4. Launch the classification and let it run in the background

  5. Download your classified Excel file

If you need help or have advanced questions, reach out to us at info@smartinterview.ai.