Frequently Asked Questions
What input formats are supported?
We currently support CSV, XLS and XLSX. For XLS and XLSX files, only the first sheet will be parsed by the machine. The file should contain only data (no rows or cells with unrelated commentary), and the first row should contain the column names.
CSV: How should it be encoded?
We try to guess your encoding, but we would prefer that you upload your files in UTF-8 (for the sake of humanity; click here for an actual reason). If that's really not possible, we also accept ISO-8859-1 and Windows-1252.
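If your file is in one of the other accepted encodings, you can convert it to UTF-8 yourself before uploading. The snippet below is a minimal sketch, not part of the Merge Machine: the function name `reencode_to_utf8` and the file names are made up for illustration, and it assumes you already know the file's current encoding.

```python
from pathlib import Path


def reencode_to_utf8(src, dst, source_encoding="windows-1252"):
    """Read `src` using `source_encoding` and write the same text to `dst` as UTF-8.

    `source_encoding` is an assumption you must supply (for example
    "iso-8859-1" or "windows-1252"); this sketch does not try to guess it.
    """
    text = Path(src).read_text(encoding=source_encoding)
    Path(dst).write_text(text, encoding="utf-8")
```

For example, `reencode_to_utf8("clients.csv", "clients_utf8.csv", "iso-8859-1")` would write a UTF-8 copy that you can upload instead of the original.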
CSV: What separator should I use?
We accept commas (","), semicolons (";") and tabs.
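If you are not sure which separator your file uses, you can check it locally before uploading. This is a rough sketch using Python's standard `csv.Sniffer`, restricted to the three accepted separators; the function name `guess_separator` is hypothetical, and this is not necessarily how the Merge Machine detects separators.

```python
import csv


def guess_separator(path, encoding="utf-8", sample_size=64 * 1024):
    """Guess whether a CSV file uses ",", ";" or a tab as its separator.

    Only the first `sample_size` characters are inspected. csv.Sniffer
    raises csv.Error when it cannot decide.
    """
    with open(path, newline="", encoding=encoding) as f:
        sample = f.read(sample_size)
    dialect = csv.Sniffer().sniff(sample, delimiters=",;\t")
    return dialect.delimiter
```

The guess is a heuristic, so it is worth opening the file in a text editor if the result looks wrong.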
How big can my files be?
CSV uploads are currently limited to 2M rows and Excel files to 500K rows. You can include up to 100 columns, but we suggest you remove any unnecessary columns before uploading; this will make your upload and processing faster and avoid using up our storage space :)
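One way to trim a CSV before uploading is to copy only the columns you need into a new file and count the rows as you go. The sketch below uses only the Python standard library; `keep_columns` and the column names in the usage note are hypothetical, and it assumes a UTF-8 file whose first row holds the column names.

```python
import csv


def keep_columns(src, dst, columns, delimiter=","):
    """Copy only `columns` from the CSV at `src` to `dst`.

    Returns the number of data rows written, so it can be compared
    against the upload limit.
    """
    with open(src, newline="", encoding="utf-8") as fin, \
            open(dst, "w", newline="", encoding="utf-8") as fout:
        reader = csv.DictReader(fin, delimiter=delimiter)
        # extrasaction="ignore" silently drops every column not listed in `columns`.
        writer = csv.DictWriter(
            fout, fieldnames=columns, delimiter=delimiter, extrasaction="ignore"
        )
        writer.writeheader()
        rows = 0
        for row in reader:
            writer.writerow(row)
            rows += 1
    return rows
```

For instance, `keep_columns("clients.csv", "clients_small.csv", ["name", "city"])` returns the row count, which you can check against the 2M-row CSV limit before uploading.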
Should I re-upload a file if I need to link it twice?
No! Save some time and choose it from your previously uploaded files.
What should I know about my file before I start linking?
We expect you to know the content and columns of the source and reference files, at least well enough to decide which columns are good to match on.
How do I choose what columns to match on?
Basically, you should give the machine the minimum amount of information that would allow a human with no knowledge of the field to decide whether a match is valid.
How long will it take?
Once you are familiar with the Merge Machine, the column selection and labelling process might take 5 or 10 minutes of your time. The processing time (indexing and linking, in particular), on the other hand, will vary with the size of your file and the traffic on our website. Indexing and linking a 1,000-row file might take around 30 seconds each (time should be proportional to the number of rows).
I don't want to pay if it doesn't work!
We agree! By the end of the labelling phase, you should have an idea of how well the matching is doing. If you are not satisfied with the results you are getting, don't continue with linking and you will keep your credits :)
What is the output file like?
The output table is a CSV file with the same number of rows as your source table (dirty), plus additional columns for the data that was found in the reference (clean).
I'm still not sure how this works...
Don't worry, each page of the linking process has a "How To" button with detailed information on how to use the page.