feat: add BrightDataWebSearch #10285
@xoaryaa and @meirk-brd, we can't integrate everyone's favourite search engine into Haystack core; the core package would quickly become bloated. We reserve core for essential components only, and we direct integrations such as this one to https://github.com/deepset-ai/haystack-core-integrations/ or, even better, to a self-maintained component listed in our https://github.com/deepset-ai/haystack-integrations/ community project.
I recommend that you self-host this repo; we'll gladly add it to https://github.com/deepset-ai/haystack-integrations/ and co-promote content you publish around it.
Got it, thank you very much @xoaryaa and @vblagoje. We can close this one. @xoaryaa, we will create a PyPI package out of your contribution and share the docs in https://github.com/deepset-ai/haystack-integrations/
Perfect, we have a deal @meirk-brd: pick a repo and run your own dev cycles independently; the component contract will likely stay the same for a long time! Update your integration details on https://github.com/deepset-ai/haystack-integrations/ when needed and we'll approve them quickly! For a blog post and other content, coordinate with @bilgeyucel and we are good to go! Thanks for these contributions; we look forward to your upcoming releases.
Thanks @meirk-brd @vblagoje |
Related Issues
Proposed Changes:
- Adds `BrightDataWebSearch` under `haystack/components/websearch`.
- Calls `https://api.brightdata.com/request` using `data_format="parsed_light"`.
- `page_number`: maps `page_number` to the Google `start` parameter (`start=(page_number-1)*10`).
- Converts `organic` results into Haystack `Document`s:
  - `content`: `description` (fallback to `title`)
  - `meta`: `title`, `link` (and passes through optional fields when present, e.g. `extensions`, `global_rank`)
- `allowed_domains`.
- `BRIGHT_DATA_API_TOKEN`
- `BRIGHT_DATA_ZONE`
How did you test it?
- `pytest test/components/websearch/test_brightdata.py`
- `BRIGHT_DATA_API_TOKEN`
- `BRIGHT_DATA_ZONE`
Notes for the reviewer
- `start` only.
- Raises `TimeoutError` on request timeouts.
- Raises `BrightDataWebSearchError` for request failures or invalid responses.
Checklist
- `feat:` and added `!` in case the PR includes breaking changes.
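For readers who want to see the pagination and result mapping described under Proposed Changes in concrete form, here is a minimal sketch. It is not the code from this PR: the function names `page_to_start` and `organic_to_documents` are hypothetical, plain dicts stand in for Haystack `Document` objects, and rejecting `page_number < 1` is an assumption rather than documented behaviour.

```python
from typing import Any, Dict, List

# Assumption: page size of 10, taken from start=(page_number-1)*10 in the PR description.
RESULTS_PER_PAGE = 10


def page_to_start(page_number: int) -> int:
    """Map a 1-based page number to the zero-based Google `start` offset."""
    if page_number < 1:
        # Assumption: the PR description does not say how invalid pages are handled.
        raise ValueError("page_number must be >= 1")
    return (page_number - 1) * RESULTS_PER_PAGE


def organic_to_documents(organic: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Convert `organic` results into Document-like dicts (content plus meta)."""
    optional_keys = ("extensions", "global_rank")  # passed through only when present
    documents = []
    for item in organic:
        # content: description, falling back to title
        content = item.get("description") or item.get("title") or ""
        meta = {"title": item.get("title"), "link": item.get("link")}
        for key in optional_keys:
            if key in item:
                meta[key] = item[key]
        documents.append({"content": content, "meta": meta})
    return documents
```

In a real component the dicts would presumably be replaced by Haystack `Document(content=..., meta=...)` instances, and `page_to_start` would feed the `start` query parameter of the search URL sent to the Bright Data endpoint.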