refactor: optimize the process of graph rag#101

Merged
imbajin merged 13 commits into apache:main from ChenZiHong-Gavin:optimize-graph-rag
Nov 14, 2024

@ChenZiHong-Gavin ChenZiHong-Gavin commented Oct 28, 2024

  • replace print calls with logging
  • strip single quotes from keywords
  • handle the case where the keywords list is empty
  • regarding the issue of duplicate vid attributes: compare the effectiveness of the candidate solutions
  • detect null values in vid attributes to reduce unnecessary token consumption
  • add few-shot examples to the KEYWORDS EXTRACTION and KEYWORDS EXPAND prompt templates
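
To illustrate the null-value item above, here is a minimal sketch of dropping empty properties before a vertex is serialized into the prompt. It is not the code from this PR: the helper names `is_empty` and `drop_empty_props` are hypothetical, and reading "vid attributes" as the property map attached to a vertex is an assumption.

```python
from typing import Any, Dict


def is_empty(value: Any) -> bool:
    """Treat None and zero-length strings/containers as empty (0 and False are kept)."""
    if value is None:
        return True
    if isinstance(value, (str, list, tuple, dict, set)):
        return len(value) == 0
    return False


def drop_empty_props(props: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `props` without empty values, so they cost no prompt tokens."""
    return {key: value for key, value in props.items() if not is_empty(value)}


# Hypothetical vertex properties, for illustration only.
vertex_props = {"name": "Tom", "age": 0, "city": None, "tags": [], "note": ""}
kept_props = drop_empty_props(vertex_props)
```

Legitimate falsy values such as `0` survive the filter, which is why the check is explicit rather than a plain truthiness test.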

@github-actions github-actions Bot added the llm label Oct 28, 2024
@ChenZiHong-Gavin
Collaborator Author

[image]
As the screenshot shows, single quotes in a keyword may lead to a Gremlin error, whereas double quotes do not.
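
A rough sketch of how the quote stripping and the empty-list guard from the description could fit together. The function names, the `name` property, and the shape of the generated query string are assumptions made for illustration; the query actually built in graph_rag_query.py may differ.

```python
import logging
from typing import List, Optional

log = logging.getLogger(__name__)


def clean_keywords(keywords: List[str]) -> List[str]:
    """Strip single quotes and surrounding whitespace; drop blanks and duplicates."""
    cleaned: List[str] = []
    for keyword in keywords:
        keyword = keyword.replace("'", "").strip()
        if keyword and keyword not in cleaned:
            cleaned.append(keyword)
    return cleaned


def build_match_query(keywords: List[str], prop: str = "name") -> Optional[str]:
    """Build a Gremlin-style lookup string, or return None when nothing is left to match."""
    cleaned = clean_keywords(keywords)
    if not cleaned:
        # Logged rather than printed, and no query is issued for an empty list.
        log.warning("No usable keywords, skipping the graph query")
        return None
    quoted = ", ".join(f"'{keyword}'" for keyword in cleaned)
    return f"g.V().has('{prop}', within({quoted}))"


query = build_match_query(["'Tom'", " O'Brien ", ""])
```

Because the keywords are interpolated inside single-quoted literals, a stray single quote would end the literal early; removing it before building the string avoids that.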

@ChenZiHong-Gavin ChenZiHong-Gavin marked this pull request as ready for review November 5, 2024 08:37
@dosubot dosubot Bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Nov 5, 2024
Comment thread hugegraph-llm/src/hugegraph_llm/operators/hugegraph_op/graph_rag_query.py Outdated
subgraph = set()
subgraph_with_degree = {}
vertex_degree_list: List[Set[str]] = []
global_vertex_cache: Set[str] = set()
Member

@imbajin imbajin Nov 13, 2024

use a short name like v_cache or v_set? same as e_cache

Collaborator Author

@ChenZiHong-Gavin ChenZiHong-Gavin Nov 13, 2024

> use a short name like v_cache or v_set? same as e_cache

v_cache and e_cache seem good
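
For context, one way such caches might be used while flattening query results, under assumed data shapes: each path is taken to be a list of (source vid, edge label, target vid) triples, and `collect_unique` is a hypothetical helper, not a function from this PR.

```python
from typing import List, Set, Tuple

Edge = Tuple[str, str, str]  # (source vid, edge label, target vid); assumed shape


def collect_unique(paths: List[List[Edge]]) -> Tuple[List[str], List[Edge]]:
    """Flatten paths into vertex and edge lists, emitting each item only once."""
    v_cache: Set[str] = set()   # vertex ids already emitted
    e_cache: Set[Edge] = set()  # edges already emitted
    vertices: List[str] = []
    edges: List[Edge] = []
    for path in paths:
        for src, label, dst in path:
            for vid in (src, dst):
                if vid not in v_cache:
                    v_cache.add(vid)
                    vertices.append(vid)
            edge = (src, label, dst)
            if edge not in e_cache:
                e_cache.add(edge)
                edges.append(edge)
    return vertices, edges


# Two overlapping paths; the shared vertex and edge appear once in the output.
paths = [
    [("1:Tom", "knows", "1:Ann")],
    [("1:Tom", "knows", "1:Ann"), ("1:Ann", "likes", "2:Tea")],
]
vertices, edges = collect_unique(paths)
```

The sets give constant-time membership checks, while the lists preserve first-seen order for the text that is eventually sent to the LLM.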

@dosubot dosubot Bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:M This PR changes 30-99 lines, ignoring generated files. labels Nov 14, 2024
@imbajin imbajin merged commit 6a8a379 into apache:main Nov 14, 2024
Labels

llm size:L This PR changes 100-499 lines, ignoring generated files.
