Glean’s official LangChain integration enables you to build powerful AI agents that can search and reason over your organization’s knowledge using Python and the LangChain framework.
You’ll need Glean API credentials, specifically a user-scoped API token with the chat and search scopes. Ask your Glean administrator to provision the token.
Configure your Glean credentials by setting the following environment variables:
    export GLEAN_SUBDOMAIN="your-glean-subdomain"
    export GLEAN_API_TOKEN="your-glean-api-token"
    export GLEAN_ACT_AS="user@example.com"  # Optional: Email to act as when making requests
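Before initializing any Glean component, it can help to confirm that the variables above are actually set in the current process. The following is a minimal sketch, not part of the integration itself: the helper name `missing_glean_env` is hypothetical, and it assumes only that `GLEAN_SUBDOMAIN` and `GLEAN_API_TOKEN` are required while `GLEAN_ACT_AS` is optional, as the comment on the export lines suggests.

```python
import os

# Variable names come from the export lines above; GLEAN_ACT_AS is optional.
REQUIRED_VARS = ("GLEAN_SUBDOMAIN", "GLEAN_API_TOKEN")


def missing_glean_env(env=None):
    """Return the required Glean variables that are unset or blank.

    Hypothetical helper for illustration; `env` defaults to os.environ.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


missing = missing_glean_env()
if missing:
    print("Missing Glean settings: " + ", ".join(missing))
else:
    print("Glean credentials found in the environment.")
```

Running this check at startup gives a readable message when a variable is missing, instead of a failure later when the first request is made.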
The GleanSearchRetriever allows you to search and retrieve documents from Glean:
    from langchain_glean.retrievers import GleanSearchRetriever

    # Initialize the retriever (will use environment variables)
    retriever = GleanSearchRetriever()

    # Search for documents
    documents = retriever.invoke("quarterly sales report")

    # Process the results
    for doc in documents:
        print(f"Title: {doc.metadata.get('title')}")
        print(f"URL: {doc.metadata.get('url')}")
        print(f"Content: {doc.page_content}")
        print("---")
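When processing results, some documents may lack a `title` or `url` in their metadata, and `page_content` can be long. The sketch below shows one way to print a compact one-line summary per result. The `summarize` helper is hypothetical, and the sample uses `SimpleNamespace` stand-ins with the same `metadata` and `page_content` attributes so it runs without calling Glean.

```python
from types import SimpleNamespace


def summarize(doc, max_chars=80):
    """Render one result as 'title | url | snippet' with fallbacks.

    Hypothetical helper: works on any object exposing `metadata` (a dict)
    and `page_content` (a string).
    """
    title = doc.metadata.get("title") or "(untitled)"
    url = doc.metadata.get("url") or "(no URL)"
    # Collapse runs of whitespace so the snippet fits on one line.
    snippet = " ".join(doc.page_content.split())
    if len(snippet) > max_chars:
        snippet = snippet[: max_chars - 3].rstrip() + "..."
    return f"{title} | {url} | {snippet}"


# Stand-in result for illustration only; real results come from the retriever.
sample = SimpleNamespace(
    page_content="Revenue  grew\n12% in Q2.",
    metadata={"title": "Q2 report"},
)
print(summarize(sample))
```

With real results, the same helper could be applied in the loop above, for example `print(summarize(doc))` for each `doc` in `documents`.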
The GleanSearchTool can be used in LangChain agents to search Glean:
    from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
    from langchain_openai import ChatOpenAI
    from langchain.agents import AgentExecutor, create_openai_tools_agent
    from langchain_glean.retrievers import GleanSearchRetriever
    from langchain_glean.tools import GleanSearchTool

    # Initialize the retriever
    retriever = GleanSearchRetriever()

    # Create the tool
    glean_tool = GleanSearchTool(
        retriever=retriever,
        name="glean_search",
        description="Search for information in your organization's content using Glean.",
    )

    # Create an agent with the tool
    llm = ChatOpenAI(model="gpt-4o")
    prompt = ChatPromptTemplate.from_messages([
        ("system", "You are a helpful assistant with access to Glean search."),
        ("user", "{input}"),
        # create_openai_tools_agent requires an agent_scratchpad variable,
        # which holds the agent's intermediate tool calls and results
        MessagesPlaceholder("agent_scratchpad"),
    ])
    agent = create_openai_tools_agent(llm, [glean_tool], prompt)
    agent_executor = AgentExecutor(agent=agent, tools=[glean_tool])

    # Run the agent
    response = agent_executor.invoke({"input": "Find the latest quarterly report"})
    print(response["output"])
You can integrate the retriever with LangChain chains for more complex workflows:
    from langchain_core.output_parsers import StrOutputParser
    from langchain_core.prompts import ChatPromptTemplate
    from langchain_core.runnables import RunnablePassthrough
    from langchain_openai import ChatOpenAI
    from langchain_glean.retrievers import GleanSearchRetriever

    # Initialize the retriever
    retriever = GleanSearchRetriever()

    # Create a prompt template
    prompt = ChatPromptTemplate.from_template(
        """Answer the question based only on the context provided.

    Context: {context}

    Question: {question}"""
    )

    # Initialize the language model
    llm = ChatOpenAI(model="gpt-4o")

    # Format documents function
    def format_docs(docs):
        return "\n\n".join(doc.page_content for doc in docs)

    # Create the chain
    chain = (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )

    # Run the chain
    result = chain.invoke("What were our Q2 sales results?")
    print(result)
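The `format_docs` step decides what the model sees as context. As a possible variation, the sketch below labels each document with its title and URL and stops adding documents once a rough character budget is reached. The function name `format_docs_with_sources` and the `max_chars` parameter are assumptions for illustration, and the sample data uses `SimpleNamespace` stand-ins rather than real retriever output.

```python
from types import SimpleNamespace


def format_docs_with_sources(docs, max_chars=4000):
    """Join documents into one context string, labelling each with its source.

    Hypothetical variation on format_docs. The first document is always
    included; later ones are dropped once the running total would exceed
    max_chars (separators are not counted, so the budget is approximate).
    """
    parts = []
    used = 0
    for index, doc in enumerate(docs, start=1):
        title = doc.metadata.get("title") or "untitled"
        url = doc.metadata.get("url") or "no URL"
        block = f"[{index}] {title} ({url})\n{doc.page_content}"
        if parts and used + len(block) > max_chars:
            break
        parts.append(block)
        used += len(block)
    return "\n\n".join(parts)


# Stand-in documents for illustration only.
docs = [
    SimpleNamespace(
        page_content="Revenue grew.",
        metadata={"title": "Q2 report", "url": "https://example.com/q2"},
    ),
    SimpleNamespace(page_content="Notes.", metadata={}),
]
print(format_docs_with_sources(docs))
```

Under these assumptions, the function could stand in for `format_docs` in the chain above (`retriever | format_docs_with_sources`), which would let the model cite numbered sources in its answer.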