The data used in this tutorial is available from your account once you start a new project. We invite you to try out the different steps as you read the tutorial. The tutorial is meant to be a quick overview of the process; additional information is available on each page via the "How To" button at the top.

Tutorial

1. Upload/select your "source" and "reference" files

The source file is the file to which you want to add information. The reference file is the file from which you want to get that information. You may select the tutorial data from the "public files" button. You can also reuse files you previously uploaded by clicking "Use my previous files".

Upload process

2. Match columns

Select the columns that contain similar data and that can be used to match the source with the reference.

Column matching process
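
To make the idea of matching columns concrete, here is a minimal sketch in Python. It is not the tool's implementation: the column names, the `column_matches` structure and the `matched_values` helper are all hypothetical. It only illustrates that one or several source columns can be paired with one or several reference columns that hold similar data.

```python
# Hypothetical column pairings: each entry links source columns to the
# reference columns that contain comparable data.
column_matches = [
    {"source": ["name"], "ref": ["full_name"]},
    {"source": ["city", "zip"], "ref": ["locality"]},
]


def matched_values(row, columns):
    """Join the values of the given columns into one comparable string."""
    return " ".join(str(row.get(col, "")).strip() for col in columns).strip()


# Made-up example rows, one from each file.
source_row = {"name": "Lycee Jean Moulin", "city": "Lyon", "zip": "69005"}
ref_row = {"full_name": "LYCEE JEAN MOULIN", "locality": "Lyon 69005"}

# For each pairing, build the two strings that would be compared.
pairs = [
    (matched_values(source_row, m["source"]), matched_values(ref_row, m["ref"]))
    for m in column_matches
]
```

Each tuple in `pairs` holds the source text and the reference text for one pairing, which is the kind of input a matching step would compare.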

3. Labelling (optional)

You are asked to find a match for the "source" row that is shown, among a list of probable matching rows from the "reference" file. You may add "temporary filters" to help find the match for an individual row of the source. You may also add permanent filters to restrict the reference file for all matches.

3.1 Labelling: identifying matching rows

The first row is the data from the "source". The following rows are from the "reference". Click on a row from the "reference" to identify it as a match for the "source" row being displayed. Press "None of the above" if you find no valid result.

Labelling process
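
As an illustration of how a list of probable matches could be produced for one source row, the sketch below ranks reference values by plain string similarity using Python's standard `difflib` module. This is an assumption made for the example only; the tool's actual scoring method is not described here, and `similarity` and `rank_candidates` are hypothetical names.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio between two strings (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank_candidates(source_value, reference_values, top_n=3):
    """Return the top_n reference values most similar to the source value."""
    scored = [(similarity(source_value, value), value) for value in reference_values]
    # Highest similarity first; the sort is stable, so ties keep input order.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [value for _, value in scored[:top_n]]


# Made-up reference values for the example.
reference_values = ["Lycee Jean Moulin", "COLLEGE VICTOR HUGO", "Ecole Pasteur"]
candidates = rank_candidates("College Victor Hugo", reference_values, top_n=2)
```

The labelling screen plays the role of the human check on such a list: you confirm one of the candidates or reject them all.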

3.2 Adding temporary filters (word search)

If you are having trouble finding the match in the proposed list, you can search for specific terms in the "reference". To do so, click on any word of the source to add it as a filter.

Labelling process
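
The effect of a temporary filter can be pictured as a simple word search over the reference rows. The sketch below is a hedged illustration, not the tool's code: `apply_word_filters` is a hypothetical helper that keeps only the rows containing every selected word, ignoring case.

```python
def apply_word_filters(reference_rows, words):
    """Keep reference rows whose combined text contains every filter word."""
    wanted = [word.lower() for word in words]
    kept = []
    for row in reference_rows:
        # Search across all columns of the row at once.
        text = " ".join(str(value) for value in row.values()).lower()
        if all(word in text for word in wanted):
            kept.append(row)
    return kept


# Made-up reference rows for the example.
reference_rows = [
    {"name": "Lycee Jean Moulin", "city": "Lyon"},
    {"name": "College Jean Moulin", "city": "Paris"},
    {"name": "Ecole Pasteur", "city": "Lyon"},
]

# Clicking the words "Moulin" and "Lyon" in the source row.
filtered = apply_word_filters(reference_rows, ["Moulin", "Lyon"])
```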

3.3 Adding permanent filters

Sometimes, all the elements of the source are contained in a subset of rows of the "reference" that can be identified by a specific value or by the presence of a word in a column. At other times, you know that the presence of a word in the reference means there is certainly no possible match. In both cases, you can use permanent filters (include or exclude) to reduce the number of reference rows to search and to improve matching speed and quality.

Labelling process
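
The include and exclude logic described above can be sketched as follows. This is only an illustration under assumptions: the `restrict_reference` function, its parameters and the column names are hypothetical, and the tool may apply its filters differently.

```python
def restrict_reference(rows, include=None, exclude=None):
    """Apply permanent filters to the reference rows.

    include: {column: word} pairs that a row must all contain.
    exclude: {column: word} pairs; a row containing any of them is dropped.
    """
    include = include or {}
    exclude = exclude or {}

    def contains(row, column, word):
        return word.lower() in str(row.get(column, "")).lower()

    kept = []
    for row in rows:
        if not all(contains(row, col, word) for col, word in include.items()):
            continue
        if any(contains(row, col, word) for col, word in exclude.items()):
            continue
        kept.append(row)
    return kept


# Made-up reference rows for the example.
rows = [
    {"name": "Lycee Jean Moulin", "status": "open", "region": "Rhone"},
    {"name": "Lycee Ampere", "status": "closed", "region": "Rhone"},
    {"name": "Lycee Hugo", "status": "open", "region": "Paris"},
]

# Keep only rows from the Rhone region and drop closed establishments.
subset = restrict_reference(rows, include={"region": "Rhone"}, exclude={"status": "closed"})
```

Because the filtered subset is smaller, every later search has fewer rows to compare against, which is where the gain in speed and quality comes from.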

4. Manually set linking parameters (optional/advanced)

You can manually define the linking parameters, or tweak them if you completed step 3 (Labelling). This requires a good understanding of the process and is not recommended for first-time users.

5. Review results

This last page allows you to review the results of matching and correct them directly in the interface. The corrections will be reflected in the downloaded file and can also be used as additional labels for a re-run of linking.

Reading results

5.1 Correct your results

Press the thumbs-up button to confirm that a result is valid, or the thumbs-down button to mark it as wrong. Press the chevron to view other probable options and replace the match that was found by the machine.

Review process

5.2 Download your file

You may choose the output format of your file, as well as which results to include (confirmed matches, probable matches, all matches found...).

Download process
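
Choosing which results to include can be thought of as filtering rows by status before writing the file. The sketch below assumes a hypothetical `status` field with values such as "confirmed" and "probable", and writes CSV with Python's standard `csv` module; the real output formats and field names of the tool may differ.

```python
import csv
import io


def export_results(results, statuses):
    """Return a CSV string containing only results with a wanted status."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["source", "match", "status"], lineterminator="\n"
    )
    writer.writeheader()
    for result in results:
        if result["status"] in statuses:
            writer.writerow(result)
    return buffer.getvalue()


# Made-up results for the example.
results = [
    {"source": "A", "match": "a", "status": "confirmed"},
    {"source": "B", "match": "b", "status": "probable"},
]

confirmed_only = export_results(results, {"confirmed"})
everything = export_results(results, {"confirmed", "probable"})
```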

That's it!

If you still have doubts, don't worry, each page has a "How To" button with details on how to use it.