That is interesting, Jasper. Are you feeding the whole HTML text to GPT-4o? Are you not doing any data cleaning, such as stripping HTML tags?

I tried cleaning first, but got the same results as when sending the whole HTML text to GPT-4o. I might go back to cleaning to optimize for speed, but I'm not sure how much that will help.

This approach doesn't work on pages where content is loaded dynamically, so for my project I will add a 'paste your own HTML' option or a browser plugin. (That workaround doesn't work when scraping on a big scale, of course.)

I was asking because I heard that: 1. HTML tags take up unnecessary tokens, and 2. GPT might not work as well with all the HTML tags in the input.
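
For anyone who wants to test the token point themselves, here is a minimal sketch of a cleaning step, assuming plain visible text is all the model needs. It is not what Jasper's project does; the names `TextExtractor` and `html_to_text` are made up for illustration, and it uses only Python's standard-library `html.parser`.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text and ignores tags, scripts, and styles."""

    # Elements whose contents are never shown to the reader.
    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside skipped elements.
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        return "\n".join(self._chunks)


def html_to_text(html):
    """Hypothetical helper: return the visible text of an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    page = (
        "<html><head><style>p{color:red}</style></head>"
        "<body><h1>Title</h1><p>Hello <b>world</b></p></body></html>"
    )
    cleaned = html_to_text(page)
    print(cleaned)
    print(len(page), "characters before,", len(cleaned), "after")
```

Character count is only a rough proxy for tokens, and stripping tags also throws away structure (tables, links, headings) that the model may be using, so it is worth comparing output quality on a few pages before committing to either approach.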