216 Commits

Author SHA1 Message Date
Ahmed Osman 90558ca688
FIX #2617 Cheerio Web Crawler doesn't work with large sites (#2678)
* FIX #2617 Big sites scan error

* FIX #2617 Big sites scan error - review fix

---------

Co-authored-by: Ahmed Osman <ahmed.osman@evolpe.pl>
2024-07-05 11:34:47 +01:00
William Espegren cacbfa8162
feat: Add limit parameter to Spider tool (#2762)
* feat: Add limit parameter to Spider tool

* fix pnpm lint
2024-07-05 11:23:34 +01:00
William Espegren 656f6cad81
Feature/Spider (open-source web scraper & crawler) (#2738)
* Add Spider Scraper & Crawler

* fix pnpm lint

* chore: Update metadata to be correct format

* fix pnpm lint
2024-07-02 00:00:52 +01:00
Henry Heng b55f87cc40
Feature/FireCrawl (#2728)
* add firecrawl

* Update FireCrawl.ts (#2692)

---------

Co-authored-by: Eric Ciarla <43451761+ericciarla@users.noreply.github.com>
2024-06-26 14:40:43 +01:00
Mohamed Akram c34eb8ee15
Unstructured Upsert bug (#2628)
* Unstructured Upsert bug
When upserting with the API, the uploaded files are of type pdfFile, txtFile, etc.
but the code reads only fileObject which is the uploaded file using the button

* Update UnstructuredFile.ts

fixed linting error

---------

Co-authored-by: Mohamed Akram <makram@ntgclarity.com>
2024-06-14 02:39:46 +01:00
Henry Heng 8ebc4dcfd5
Feature/lang graph (#2319)
* add langgraph

* datasource: initial commit

* datasource: datasource details and chunks

* datasource: Document Store Node

* more changes

* Document Store - Base functionality

* Document Store Loader Component

* Document Store Loader Component

* before merging the modularity PR

* after merging the modularity PR

* preview mode

* initial draft PR

* fixes

* minor updates and fixes

* preview with loader and splitter

* preview with credential

* show stored chunks

* preview update...

* edit config

* save, preview and other changes

* save, preview and other changes

* save, process and other changes

* save, process and other changes

* alpha1 - for internal testing

* rerouting urls

* bug fix on new loader create

* pagination support for chunks

* delete document store

* Update pnpm-lock.yaml

* doc store card view

* Update store files to use updated storage functions, Document Store Table View and other changes

* ui changes

* add expanded chunk dialog, improve ui

* change throw Error to InternalError

* Bug Fixes and removal of subFolder, adding of view chunks for store

* lint fixes

* merge changes

* DocumentStoreStatus component

* ui changes for doc store

* add remove metadata key field, add custom document loader

* add chatflows used doc store chips

* add types/interfaces to DocumentStore Services

* document loader list dialog title bar color change

* update interfaces

* Whereused Chatflow Name and Added chunkNo to retain order of created chunks.

* use typeorm order chunkNo, ui changes

* update tabler icons react

* cleanup agents

* add pysandbox tool

* add abort functionality, loading next agent

* add empty view svg

* update chatflow tool with chatId

* rename to agentflows

* update worker for prompt input values

* update dashboard to agentflows, agentcanvas

* fix marketplace use template

* add agentflow templates

* resolve merge conflict

* update baseURL

---------

Co-authored-by: vinodkiran <vinodkiran@usa.net>
Co-authored-by: Vinod Paidimarry <vinodkiran@outlook.in>
2024-05-21 16:36:42 +01:00
Vinod Kiran 95f1090bed
BugFix #2386: Double quotes are not escaped, flow crashes (#2448)
Fix for #2386
2024-05-21 12:10:30 +01:00
Henry Heng b50103021c
Feature/Ability to omit all metadata keys using asterisk (#2401)
add ability to omit all metadata keys using asterisk
2024-05-13 16:30:57 +01:00
automaton82 43b22476e3
Fixes 2343 CSV error with no text splitter (#2344) 2024-05-07 01:43:42 +01:00
Vinod Kiran 40e36d1b39
Feature/DocumentStore (#2106)
* datasource: initial commit

* datasource: datasource details and chunks

* datasource: Document Store Node

* more changes

* Document Store - Base functionality

* Document Store Loader Component

* Document Store Loader Component

* before merging the modularity PR

* after merging the modularity PR

* preview mode

* initial draft PR

* fixes

* minor updates and fixes

* preview with loader and splitter

* preview with credential

* show stored chunks

* preview update...

* edit config

* save, preview and other changes

* save, preview and other changes

* save, process and other changes

* save, process and other changes

* alpha1 - for internal testing

* rerouting urls

* bug fix on new loader create

* pagination support for chunks

* delete document store

* Update pnpm-lock.yaml

* doc store card view

* Update store files to use updated storage functions, Document Store Table View and other changes

* ui changes

* add expanded chunk dialog, improve ui

* change throw Error to InternalError

* Bug Fixes and removal of subFolder, adding of view chunks for store

* lint fixes

* merge changes

* DocumentStoreStatus component

* ui changes for doc store

* add remove metadata key field, add custom document loader

* add chatflows used doc store chips

* add types/interfaces to DocumentStore Services

* document loader list dialog title bar color change

* update interfaces

* Whereused Chatflow Name and Added chunkNo to retain order of created chunks.

* use typeorm order chunkNo, ui changes

---------

Co-authored-by: Henry <hzj94@hotmail.com>
Co-authored-by: Henry Heng <henryheng@flowiseai.com>
2024-05-06 15:23:27 +01:00
Vinod Kiran d5a97060e2
FEATURE: Adding File Upload to Unstructured Loader (#2304)
* initial commit

* updates to loader to support file upload

* updates to loader to support file upload

* update unstructured file

---------

Co-authored-by: Henry <hzj94@hotmail.com>
2024-05-02 18:34:32 +01:00
Henry Heng 2b1273ca31
Bugfix/Missing Filter for VectorStore to Document (#2285)
add filter for vector store to document
2024-04-29 22:25:40 +01:00
Vinod Kiran 7006d64de0
Feature/s3 storage (#2226)
* centralizing file writing....

* allowing s3 as storage option

* allowing s3 as storage option

* update s3 storage

---------

Co-authored-by: Henry <hzj94@hotmail.com>
2024-04-23 11:35:38 +01:00
Quinn 4c2ba109fd
Update unstructured document loaders (#2213)
* Update UnstructuredFile with missing values. Removed deprecated values.

* Update UnstructuredFolder with missing values. Removed deprecated values.

* Added support for sourceIdKey to unstructured loaders

* Update unstructured hi_res model names

* Update S3File document loader with latest unstructured and model changes

* Update credential method for S3File document loader

* moved pnpm req to engines to avoid minor version changes

* Change unstructured skipInferTableTypes parse to JSON

* Update unstructured with new params. Also fixed list order, missing values, and support for null on multiOptions.
2024-04-21 19:42:28 +01:00
louyongjiu f5be889ea8
Feature: Add pdfUsage parameter setting support to folderFiles (#2211)
* Add pdfUsage parameter setting support to folderFiles

* Add pdfUsage parameter setting additionalParams: true
2024-04-19 11:43:22 +01:00
Vinod Kiran 658fa3984e
Feature/externalize files from chatflow - do not save as base64 (#1976)
* initial commit. Externalizing the file base64 string from flowData

* csv - docloader - Externalizing the file base64 string from flowData

* csv - docloader - Externalizing the file base64 string from flowData

* DocX - docloader - Externalizing the file base64 string from flowData

* Json - docloader - Externalizing the file base64 string from flowData

* Jsonlines - docloader - Externalizing the file base64 string from flowData

* PDF - docloader - Externalizing the file base64 string from flowData

* Vectara - vector store - Externalizing the file base64 string from flowData

* OpenAPIToolkit - tools - Externalizing the file base64 string from flowData

* OpenAPIChain - chain - Externalizing the file base64 string from flowData

* lint fixes

* datasource enabled - initial commit

* CSVAgent - agents - Externalizing the file base64 string from flowData

* Externalizing the file base64 string from flowData

* Externalizing the file base64 string from flowData

* add pnpm-lock.yaml

* update filerepository to add try catch

* Rename FileRepository.ts to fileRepository.ts

---------

Co-authored-by: Henry <hzj94@hotmail.com>
Co-authored-by: Henry Heng <henryheng@flowiseai.com>
2024-04-04 21:41:06 +05:30
Henry Heng b9b0c9d227
Bugfix/Web Scraper Limit (#2083)
fix when limit set to 0, selectedLinks sliced to become empty
2024-04-02 11:14:04 +01:00
Mr Khachaturov 4ca82ee733
Extend Confluence Document Loader to Support Server/Data Center with PAT (#1998)
* Extend Confluence Document Loader to Support Server/Data Center with PAT

- Update authentication to use Personal Access Token (PAT)
- Expand compatibility to include both Confluence Cloud and Server/Data Center

* Update ConfluenceServerDCApi.credential.ts

* use the same confluence loader with different connection logic

* use the same confluence loader with different connection logic

* Apply Prettier formatting
2024-03-25 01:37:55 +08:00
Henry 5a45a99620 Merge branch 'main' into chore/Upgrade-LC-version 2024-02-19 17:39:32 +08:00
Ilango 4e8bf4903d
Merge pull request #1687 from 0xi4o/bug/scrap-limit
Fix: relative links method and limit not applying to manage links
2024-02-12 12:35:40 +05:30
Ilango 5471a4c9aa Show error when relative links method is not set and allow 0 as limit value 2024-02-12 12:01:19 +05:30
Henry caf54bf31b add document json output 2024-02-09 16:07:34 +08:00
Henry 73112ad122 Merge branch 'main' into chore/Upgrade-LC-version
# Conflicts:
#	packages/components/package.json
2024-02-07 02:01:32 +08:00
Ilyes Tascou 19fb13baf0 fix for linting 2024-02-06 14:36:32 +01:00
Ilyes Tascou 2bb2a7588a add recursive option for folder-loader 2024-02-06 14:25:40 +01:00
Ilango c2ae7e138c Apply limit to selectedLinks even when relative links method is not specified 2024-02-06 14:40:19 +05:30
Ilyes Tascou 011a0a75c3 add kotlin files to folder-loader 2024-02-05 17:20:05 +01:00
Henry 7881f295ab update vec 2 doc node 2024-02-01 19:18:14 +00:00
Henry 02fe500f21 Merge branch 'main' into chore/Upgrade-LC-version
# Conflicts:
#	packages/components/nodes/cache/RedisCache/RedisCache.ts
#	packages/components/nodes/cache/RedisCache/RedisEmbeddingsCache.ts
#	packages/components/nodes/chains/ConversationChain/ConversationChain.ts
#	packages/components/nodes/tools/RetrieverTool/RetrieverTool.ts
#	packages/components/nodes/vectorstores/Qdrant/Qdrant.ts
#	packages/components/nodes/vectorstores/Redis/Redis.ts
#	packages/components/nodes/vectorstores/Redis/RedisSearchBase.ts
#	packages/components/nodes/vectorstores/Redis/Redis_Existing.ts
#	packages/components/nodes/vectorstores/Redis/Redis_Upsert.ts
#	packages/components/src/agents.ts
2024-01-31 00:25:37 +00:00
Darien Kindlund b960f061eb Clarifying that the Limit value is ignored when Return All is set to true. 2024-01-28 20:42:19 -08:00
Darien Kindlund 66eef84633 Forgot to make maxRecords optional now 2024-01-28 20:42:19 -08:00
Darien Kindlund 37945fc998 The loadAll() function should ignore any maxRecords specified, because the intention is to load *all* of the records. Also, marking both the Return All and Limit params as optional, so as to not confuse the user. Making them both required adds a lot of confusion that doesn't make sense. Ideally, the user either specifies Return All OR specifies the Limit value but not BOTH. It seems there's no way to define "conditional requirements" in Flowise params, so it's better to make both params optional. 2024-01-28 20:42:19 -08:00
Darien Kindlund dc39d7e2be So Airtable API expects a maxRecords value to be the total set of records you want across multiple API calls. If the maxRecords is greater than 100, then it will provide pagination hints accordingly. 2024-01-28 20:42:19 -08:00
Darien Kindlund 9b71f683ff Support pagination even in loadLimit(), so that if a user wants to load more than 100 records but not all of them, they can. Currently, there's a bug where the document loader doesn't work on loading more than 100 records because of internal Airtable API limitations. The Airtable API can only fetch up to 100 records per call - anything more than that requires pagination. 2024-01-28 20:42:19 -08:00
Darien Kindlund 2237b1ab16 Fix worked, removing debug logging, and bumped node version. 2024-01-28 20:42:19 -08:00
Darien Kindlund 3b788e42e1 When you switch from GET to POST, you're supposed to adjust the Airtable URL to use the /listRecords endpoint. I didn't RTFM clearly. This is currently documented here: https://support.airtable.com/docs/enforcement-of-url-length-limit-for-web-api-requests 2024-01-28 20:42:19 -08:00
Darien Kindlund 456dfabc6f For some reason, Airtable doesn't like the POST operations, so I need to add logging to figure out why this happens 2024-01-28 20:42:18 -08:00
Darien Kindlund 8ae848110e Clarifying the description for the optional fields param 2024-01-28 20:42:18 -08:00
Darien Kindlund 72ec7878b6 Added more error checking and also fixed yet more build errors 2024-01-28 20:42:18 -08:00
Darien Kindlund 1a7cb5a010 Fixing more build errors 2024-01-28 20:42:18 -08:00
Darien Kindlund ae64854bae Fixing a bunch of build errors 2024-01-28 20:42:18 -08:00
Darien Kindlund 71f456af90 Switched to specifying Airtable fields to include rather than exclude - this helps reduce the amount of data fetched by the DocumentLoader when there are massive numbers of fields in an Airtable table. 2024-01-28 20:42:18 -08:00
Darien Kindlund 6f7b7408e1
Merge branch 'FlowiseAI:main' into main 2024-01-28 20:34:00 -08:00
Henry 0606d2c6dd upgrade langchain version 0.1.0 2024-01-26 23:41:55 +00:00
Ilango 94d8e003e7 Merge branch 'main' of github.com:0xi4o/Flowise into feature/scrapped-links 2024-01-26 03:57:21 +05:30
Ilango 193e5c4640 Update console statements to use logger 2024-01-22 08:49:04 +05:30
Ilango c24708f53b Set default value for get links limit in web scraper nodes - cheerio, playwright, and puppeteer 2024-01-22 08:42:44 +05:30
Ilango bf60a1a2a9 Fix multiple calls to parseInt 2024-01-22 08:30:36 +05:30
Darien Kindlund b76c3b27a9
Merge branch 'FlowiseAI:main' into main 2024-01-20 00:01:41 -05:00
Ilango bfa26a72c4 Use selected links if available when scraping in cheerio, puppeteer, and playwright nodes 2024-01-19 14:25:04 +05:30
Henry f26a99ade2 update figma loader 2024-01-17 22:08:04 +00:00
Darien Kindlund e3982476b0 Bumping version 2024-01-04 12:07:25 -05:00
Darien Kindlund 53bfd07694 Bumping version to reflect new feature 2024-01-03 21:23:43 -05:00
Darien Kindlund 66701cec8a Fixing linting issues using 'yarn lint-fix' 2024-01-03 13:35:25 -05:00
Darien Kindlund c035363d6f Fixing linting issues using 'yarn lint-fix' 2024-01-03 13:20:39 -05:00
Darien Kindlund 6006157958 Updated Airtable field exclusion support to use field names instead of field ids 2023-12-30 15:17:33 -05:00
Darien Kindlund 3fb8001907 Added support to exclude specific Airtable Field Ids 2023-12-30 14:13:39 -05:00
Darien Kindlund 0885946bae
Merge branch 'FlowiseAI:main' into main 2023-12-30 11:54:33 -05:00
Darien Kindlund 28e32f0ae6 Initial support for Airtable views 2023-12-30 09:46:44 -05:00
vinodkiran fe0c22255b Bugfix: Upsert successful, but failed to insert documents 2023-12-30 16:17:03 +05:30
Henry 1a4ead3544 update S3 loader 2023-12-22 01:58:40 +00:00
abhishekshankr f5536377d5 Update Icons 2023-12-19 17:53:05 -05:00
abhishekshankr c6842e1cb8 Updated icons 2023-12-18 12:13:22 -05:00
abhishekshankr 886e0af59b Updated document loader icons 2023-12-18 04:22:10 -05:00
abhishekshankr d214ddfe5b Initial Icon Tests 2023-12-13 12:45:00 -05:00
vinodkiran cc1a3101e2 S3 File Loader: Region missing fix 2023-12-06 15:01:30 +05:30
Henry Heng 4f3d352606
Merge pull request #1162 from SearchApi/feature/add-searchapi
Add SearchApi documentloader, tool & podcast QA example
2023-11-02 21:46:13 +00:00
SebastjanPrachovskij a240333e79
Run yarn lint-fix 2023-11-02 22:17:36 +02:00
Henry 689612b0d6 add query to vec2doc 2023-11-02 19:46:52 +00:00
Henry 039d8d26be expand text file supported file types 2023-10-31 16:32:33 +00:00
SebastjanPrachovskij 7bc41939e4
Add SearchApi documentloader, tool & podcast QA example 2023-10-31 16:27:37 +02:00
Henry db06f85c2a add multi options 2023-10-13 01:28:48 +01:00
Henry 21b2ef7f1d add unstructured 2023-10-12 15:31:19 +01:00
Henry 789d8d0001 add s3 loader 2023-10-12 11:56:52 +01:00
Jan Staněk 4d312cb13b version 2023-10-11 08:16:50 +02:00
Jan Staněk 9af7775d0e text output for Text File and Plain Text components 2023-10-10 16:30:46 +02:00
Henry 14f2dcae6a add plain text doc loader 2023-10-06 17:06:00 +01:00
matthias 173b645fe8 Added CSS selector to Cheerio 2023-09-19 19:34:01 +02:00
Henry 06a5a71d26 update apify for metadata 2023-09-08 12:42:26 +01:00
Henry 6372cb9c72 update marketplace 2023-08-30 12:34:50 +01:00
Henry b2627cafa3 update steps 2023-08-30 12:29:32 +01:00
Henry fda49637ce add additional params 2023-08-30 12:28:01 +01:00
Hamzah Abdulfattah 939daff0a1 fix linting 2023-08-20 01:43:35 +01:00
Hamzah Abdulfattah 3b0c5b0c0d implemented serpapi loader integration and added its corresponding template 2023-08-19 17:23:19 +01:00
Henry Heng e9adc15ff2
Merge pull request #775 from FlowiseAI/feature/Vec2Doc
Feature/Vector to Prompt
2023-08-17 14:22:48 +01:00
Atish Amte 888fa356b9 lint fixes 2023-08-17 01:11:31 +05:30
Atish Amte 338082f0aa playwright config 2023-08-17 00:52:35 +05:30
Atish Amte 8414f347de spelling correction 2023-08-17 00:36:03 +05:30
Atish Amte c83d0ab320 added puppeteer options 2023-08-17 00:33:01 +05:30
Henry 6240461a78 Merge branch 'main' into feature/Vec2Doc 2023-08-16 11:33:41 +01:00
Henry 94461025dc add vector to prompt 2023-08-16 01:43:11 +01:00
rkeshwani d10f3800e6 Fixed spaces and comma issue. 2023-08-15 23:36:50 +00:00
rkeshwani f6933b592d Added additional file extensions and removed abstracted inputs. 2023-08-15 00:13:38 +00:00
rkeshwani 177e7f5c0f Add additional optional input parameter for adding additional file loaders. 2023-08-11 00:20:04 +00:00
drobnikj 83d8e96f9c fix: fix credentials and parsing of numbers 2023-08-09 09:49:39 +02:00
drobnikj 5146f6bde3 feat: improve apify content crawler input 2023-08-01 11:19:57 +02:00
drobnikj 3aa301119b feat: add apifyApiToken credentials v1 2023-08-01 10:22:51 +02:00
Jakub Drobník f3d74248dd
Merge branch 'main' into main 2023-07-31 13:15:05 +02:00
Henry 05dd23b01d - update marketplaces
- add version to nodes and credentials
- hover over node actions
2023-07-28 15:56:40 +01:00
Henry 61dabbb7da Merge branch 'main' into feature/Credential
# Conflicts:
#	README.md
#	docker/.env.example
#	packages/components/nodes/documentloaders/Notion/NotionDB.ts
#	packages/components/nodes/memory/DynamoDb/DynamoDb.ts
#	packages/components/nodes/memory/MotorheadMemory/MotorheadMemory.ts
#	packages/components/nodes/memory/ZepMemory/ZepMemory.ts
#	packages/components/package.json
#	packages/components/src/utils.ts
#	packages/server/.env.example
#	packages/server/README.md
#	packages/server/marketplaces/chatflows/Conversational Retrieval QA Chain.json
#	packages/server/src/ChildProcess.ts
#	packages/server/src/DataSource.ts
#	packages/server/src/commands/start.ts
#	packages/server/src/index.ts
#	packages/server/src/utils/index.ts
#	packages/server/src/utils/logger.ts
2023-07-27 11:26:34 +01:00