According to the UiPath documentation, the recommended number of documents per vendor for training the initial dataset is 10. For each vendor that supplies a given document type, such as invoices or receipts, the training dataset should therefore contain at least 10 samples of that vendor’s documents. This keeps the dataset balanced and representative of real-world data, and lets the machine learning model learn the layout variations and features specific to each vendor. Too few documents per vendor can lead to poor model accuracy, while too many documents from a single vendor can cause overfitting and bias [1].
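As a rough illustration of the balance check described above, the following Python sketch counts documents per vendor and flags vendors that fall below the 10-document figure. It is not part of UiPath or its API: the function name `vendor_coverage`, the vendor labels, and the idea of reporting each vendor's share of the dataset are all assumptions made for this example.

```python
from collections import Counter

# Figure quoted in the explanation above; adjust if your guidance differs.
RECOMMENDED_PER_VENDOR = 10


def vendor_coverage(vendors, target=RECOMMENDED_PER_VENDOR):
    """Summarise a training set given one vendor label per document.

    Returns {vendor: (count, share_of_dataset, below_target)}.
    This is a hypothetical helper, not a UiPath function.
    """
    counts = Counter(vendors)
    total = sum(counts.values())
    return {
        vendor: (count, count / total, count < target)
        for vendor, count in counts.items()
    }


# Hypothetical vendor labels, one entry per document in the training set.
docs = ["Acme"] * 10 + ["Globex"] * 3 + ["Initech"] * 25

for vendor, (count, share, below) in sorted(vendor_coverage(docs).items()):
    note = "below target" if below else "meets target"
    print(f"{vendor}: {count} documents, {share:.0%} of dataset, {note}")
```

A vendor flagged as below target needs more samples, and a vendor whose share dominates the dataset is a candidate for trimming, in line with the overfitting concern mentioned above.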
References: [1] UiPath, Document Understanding: Training High Performing Models