Hi! I wanted to share LangCheck, an open source toolkit to evaluate LLM applications (GitHub, Quickstart).
It already supports English and Japanese text, with more languages coming soon – contributions welcome!
Core functionality:
langcheck.metrics – metrics to evaluate quality & structure of LLM-generated text (see the sketch after this list)
langcheck.plot – interactive visualizations of text quality
langcheck.augment – text augmentations to perturb prompts, references, etc. (coming soon)
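To give a feel for the API, here's a minimal sketch along the lines of the Quickstart. The metric name (fluency), the threshold comparison, and the langcheck.plot.scatter call reflect my reading of the docs, so treat the exact names and signatures as illustrative rather than authoritative:

    import langcheck

    # LLM outputs you want to evaluate (generated with any LLM library)
    generated_outputs = [
        "Black cat the",
        "The black cat is sitting",
        "The big black cat is sitting on the fence",
    ]

    # Score a text quality metric; the result can be viewed as a DataFrame
    fluency_values = langcheck.metrics.fluency(generated_outputs)

    # Optionally turn raw scores into pass/fail results with a threshold
    print(fluency_values > 0.5)

    # Interactive visualization via the langcheck.plot module mentioned above
    # (exact call signature may differ in the current release)
    langcheck.plot.scatter(fluency_values)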
Super open to feedback & curious how other people think about evaluation for LLM apps.
If you're open to using an open source library, you can use LangCheck to monitor and visualize text quality metrics in production.
For example, you can compute & plot the toxicity of users' prompts and LLM responses from your logs. (A very simple example here.)
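Roughly like this (a sketch only: the hardcoded lists stand in for whatever you pull from your logging pipeline, and the toxicity and plotting calls follow the same hedged pattern as the sketch above):

    import langcheck

    # In production these would be read from your logs; hardcoded for illustration
    user_prompts = [
        "How do I reset my password?",
        "Write an angry reply to this customer email.",
    ]
    llm_responses = [
        "You can reset your password from the account settings page.",
        "Dear customer, I understand your frustration...",
    ]

    # Score toxicity for both sides of the conversation (higher = more toxic)
    prompt_toxicity = langcheck.metrics.toxicity(user_prompts)
    response_toxicity = langcheck.metrics.toxicity(llm_responses)

    # Flag responses above a threshold and visualize the scores
    print(response_toxicity > 0.5)
    langcheck.plot.scatter(response_toxicity)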
(Disclaimer: I'm one of the contributors to LangCheck)