staring_at_keyboard

joined 2 years ago

SQLCoder-34b beats GPT-4 at Text-to-SQL in c/localllama@poweruser.forum

staring_at_keyboard@alien.top 1 points 2 years ago

I looked at your eval framework. I have adopted a similar subset/superset result-set matching approach in some of my research. One word of caution: result-set matching cannot prove semantic equivalence, so you may want to evaluate against multiple database instances to reduce false positives. False positives are particularly prevalent when gold queries return scalar values or empty result sets, since many unrelated queries can yield the same single value or no rows on a given instance.
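To make the scalar and empty-result cases concrete, here is a minimal sketch using Python's built-in `sqlite3`. The `orders` table, the queries, and the helper names (`make_db`, `matches`, `matches_on_all`) are all made up for illustration and are not taken from your framework. It also uses plain multiset equality rather than subset/superset matching, just to keep it short.

```python
import sqlite3
from collections import Counter


def make_db(statuses):
    """Build an in-memory SQLite instance with a toy (hypothetical) orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
    conn.executemany(
        "INSERT INTO orders (status) VALUES (?)", [(s,) for s in statuses]
    )
    return conn


def result_multiset(conn, sql):
    """Run a query and return its rows as an order-insensitive multiset."""
    return Counter(conn.execute(sql).fetchall())


def matches(conn, gold_sql, pred_sql):
    """Result-set match on a single database instance."""
    return result_multiset(conn, gold_sql) == result_multiset(conn, pred_sql)


def matches_on_all(instances, gold_sql, pred_sql):
    """Accept a prediction only if it matches the gold query on every instance."""
    return all(matches(conn, gold_sql, pred_sql) for conn in instances)


# Scalar case: the predicted query filters on the wrong status.
GOLD_COUNT = "SELECT COUNT(*) FROM orders WHERE status = 'shipped'"
PRED_COUNT = "SELECT COUNT(*) FROM orders WHERE status = 'pending'"

# Empty-result case: both queries return no rows when neither status is present.
GOLD_EMPTY = "SELECT id FROM orders WHERE status = 'cancelled'"
PRED_EMPTY = "SELECT id FROM orders WHERE status = 'refunded'"

db_a = make_db(["shipped", "pending"])
db_b = make_db(["shipped", "shipped", "pending", "cancelled"])

# On db_a alone, both wrong predictions look correct (false positives).
scalar_single = matches(db_a, GOLD_COUNT, PRED_COUNT)
empty_single = matches(db_a, GOLD_EMPTY, PRED_EMPTY)

# Adding a second instance with a different data distribution exposes them.
scalar_multi = matches_on_all([db_a, db_b], GOLD_COUNT, PRED_COUNT)
empty_multi = matches_on_all([db_a, db_b], GOLD_EMPTY, PRED_EMPTY)

print(scalar_single, empty_single, scalar_multi, empty_multi)
```

Passing on more instances still does not prove equivalence; it only makes an accidental match less likely.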

Are you planning to submit SQLCoder-34b to other NL-to-SQL benchmarks, such as Spider or its derivatives?
