Parsee Launch
Why Parsee?
Parsee aims to be a simple, opinionated framework for structuring data from the most common sources of unstructured data, which in our view are PDFs, HTML files and images.
Parsee is NOT aimed at "chat bot"-style usage. For fully structuring data we require the output to be maximally concise and type-safe (i.e., we want to make sure the output is, for example, just a number or an enum value). For chat-bot-style usage there are already many frameworks available. That is also why, at least in this first release, we have not prioritized integration with popular frameworks such as LangChain. We may add support for these later if enough users request it.
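To make "type-safe output" concrete, here is a minimal, self-contained Python sketch of the idea; it is illustrative only and does not use Parsee's actual API. It constrains a model's raw answer to either a number or one of a fixed set of enum values, rejecting anything else:

```python
from enum import Enum
from typing import Union

# Illustrative only: the kind of constrained, type-safe output meant above,
# as opposed to free-form chat-bot text.
class Currency(Enum):
    USD = "USD"
    EUR = "EUR"

def parse_revenue(raw: str) -> Union[int, float]:
    # Coerce the raw answer into a number; fail loudly on anything else.
    cleaned = raw.replace(",", "").strip()
    value = float(cleaned)  # raises ValueError for non-numeric answers
    return int(value) if value.is_integer() else value

def parse_currency(raw: str) -> Currency:
    # Coerce the raw answer into one of the allowed enum values.
    return Currency(raw.strip().upper())  # raises ValueError otherwise

print(parse_revenue("1,234,567"))  # 1234567
print(parse_currency("usd"))       # Currency.USD
```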
If you are interested in running multiple extraction jobs in parallel in the cloud, you can sign up for Parsee Cloud: app.parsee.ai
In Parsee Cloud you can also find pre-defined extraction templates from the community or share your own.
Parsee can be used both with LLMs and with other model architectures. The latter require a dataset, which you can create via the Python API or on Parsee Cloud. If you don't have a dataset yet, you can always use a range of LLMs, which can already perform most extraction tasks fairly well (see the examples and the sketch below).
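The two paths described above look roughly like the following sketch. All names here are hypothetical placeholders, not Parsee's actual API; see the repository examples for the real interface:

```python
# Hypothetical sketch of the two extraction paths; names are illustrative
# placeholders, not Parsee's actual API.
from dataclasses import dataclass

@dataclass
class ExtractionJob:
    document_path: str
    template_name: str  # e.g. a template shared on Parsee Cloud

def run_job(job: ExtractionJob, dataset_available: bool) -> str:
    if dataset_available:
        # With a dataset (built via the Python API or on Parsee Cloud),
        # a custom model can be trained and used for extraction.
        return f"custom-model extraction for {job.document_path}"
    # Without a dataset, fall back to an off-the-shelf LLM,
    # which already handles most extraction tasks fairly well.
    return f"LLM extraction for {job.document_path}"

print(run_job(ExtractionJob("report.pdf", "revenue-template"), dataset_available=False))
```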
- GPT-4o Benchmark Results: Showing that it is Truly a Next-Generation Model. We tested the performance of GPT-4 Omni (model name: gpt-4o) on our finRAG dataset; the results show that this is truly a next-generation model that does not seem to share some common issues of previous-generation models, making it possibly the first model suitable for reliable enterprise use.
- finRAG Datasets & Study: We wanted to investigate how well the current state-of-the-art (M)LLMs can solve the relatively simple problem of extracting revenue figures from publicly available financial reports. To test this, we created 3 different datasets, all based on the same selection of 1,156 randomly selected annual reports for the year 2023 of publicly listed US companies. The resulting datasets contain a combined total of 10,404 rows, 37,536,847 tokens and 1,156 images. For our study, we are evaluating 8 state-of-the-art (M)LLMs on a subset of 100 reports.