Make AI-Driven Decisions: Financial Statement Analysis with LangChain & OpenAI

Chris Shayan
8 min readApr 3, 2024

--

Financial decisions can make or break a company’s future. Traditionally, analyzing financial statements has been a time-consuming and detail-oriented process. But what if there was a way to leverage artificial intelligence (AI) to streamline this process and gain deeper insights?

Enter LangChain and OpenAI, a powerful combination that’s revolutionizing financial statement analysis. This article will explore how these AI tools can empower you to make faster, data-driven decisions and unlock valuable financial intelligence.

Real Financial Statement Data

To conduct a comprehensive analysis, I utilized the SEC Edgar database to gather financial data for several leading companies, including Amazon, Nvidia, Apple, and Microsoft. The data encompasses a fourteen-year period, ranging from 2009 to 2022, and includes key financial metrics for each year on a per-company basis.

"Year" BIGINT, 
"Company " TEXT,
"Category" TEXT,
"Market Cap(in B USD)" FLOAT,
"Revenue" FLOAT,
"Gross Profit" FLOAT,
"Net Income" FLOAT,
"Earning Per Share" FLOAT,
"EBITDA" FLOAT,
"Share Holder Equity" FLOAT,
"Cash Flow from Operating" FLOAT,
"Cash Flow from Investing" FLOAT,
"Cash Flow from Financial Activities" FLOAT,
"Current Ratio" FLOAT,
"Debt/Equity Ratio" FLOAT,
"ROE" FLOAT,
"ROA" FLOAT,
"ROI" FLOAT,
"Net Profit Margin" FLOAT,
"Free Cash Flow per Share" FLOAT,
"Return on Tangible Equity" FLOAT,
"Number of Employees" BIGINT,
"Inflation Rate(in US)" FLOAT

Code Explanation

The analysis leveraged pandas to efficiently load the financial data from CSV files. Subsequently, SQLAlchemy facilitated the transfer of this data to a SQLite database for structured storage. Finally, LangChain’s “ChatOpenAI” large language model (LLM) module was employed, fine-tuned with FinGPT for domain-specific financial analysis.

To bridge the gap between the financial data and LangChain’s AI capabilities, the create_sql_agent function was utilized to construct an agent_executor object.

Once the agent_executor was established, I proceeded to interact with LangChain by formulating questions and submitting them through the invoke method (e.g., agent_executor.invoke(“…”)). Here are some examples of the questions posed and the corresponding responses received:

Question: Which company has a better financial performance from 2020 to 2022?

Response:

The table “FinancialStatements20092022” contains financial data for various companies from 2009 to 2022. The relevant columns for determining financial performance could be “Revenue”, “Gross Profit”, “Net Income”, “EBITDA”, “Cash Flow from Operating”, “ROE”, “ROA”, “ROI”, and “Net Profit Margin”.

However, the question does not specify the criteria for “better financial performance”. It could be based on revenue, net income, or any other financial metric.

For simplicity, let’s assume that the company with the highest average net income from 2020 to 2022 has the better financial performance.

Let’s write a SQL query to find the company with the highest average net income from 2020 to 2022.

SELECT "Company ", AVG("Net Income") as Average_Net_Income FROM FinancialStatements20092022 WHERE Year BETWEEN 2020 AND 2022 GROUP BY "Company " ORDER BY Average_Net_Income DESC LIMIT 1
Invoking: sql_db_query with {'query': 'SELECT "Company ", AVG("Net Income") as Average_Net_Income FROM FinancialStatements20092022 WHERE Year BETWEEN 2020 AND 2022 GROUP BY "Company " ORDER BY Average_Net_Income DESC LIMIT 1'}

[(‘AAPL’, 83964.66666666667)]

The company with the best financial performance from 2020 to 2022, based on the highest average net income, is Apple (AAPL) with an average net income of approximately 83,965 million USD.

Question: Which company can afford to take a loan bigger than $1 billion dollar?

The table “FinancialStatements20092022” seems to contain the relevant information. The “Company” column contains the company names, and the “Market Cap(in B USD)” column contains the market capitalization of the companies in billions of dollars.

A company’s ability to afford a loan is typically assessed based on its financial health, which can be indicated by its market capitalization. In this case, we can assume that a company can afford to take a loan bigger than $1 billion if its market capitalization is significantly larger than $1 billion.

Let’s write a query to find companies with a market capitalization larger than $1 billion. We will limit the results to the most recent year available in the data for each company.

SELECT DISTINCT "Company ", "Market Cap(in B USD)" FROM FinancialStatements20092022 WHERE "Market Cap(in B USD)" > 1 ORDER BY Year DESC LIMIT 10
Invoking: sql_db_query with {'query': 'SELECT DISTINCT "Company ", "Market Cap(in B USD)" FROM FinancialStatements20092022 WHERE "Market Cap(in B USD)" > 1 ORDER BY Year DESC LIMIT 10'}

[(‘MSFT’, 2451.23), (‘NVDA’, 1000.35), (‘AAPL’, 2066.94), (‘MSFT’, 1787.73), (‘GOOG’, 1144.35), (‘PYPL’, 81.19), (‘AIG’, 46.99), (‘PCG’, 41.28), (‘MCD’, 186.39), (‘BCS’, 32.53)]

Here are some companies that can afford to take a loan bigger than $1 billion based on their market capitalization:

1. Microsoft (MSFT) with a market cap of $2451.23 billion

2. Nvidia (NVDA) with a market cap of $1000.35 billion

3. Apple (AAPL) with a market cap of $2066.94 billion

4. Google (GOOG) with a market cap of $1144.35 billion

5. McDonald’s (MCD) with a market cap of $186.39 billion

Please note that the ability to afford a loan also depends on other factors such as the company’s debt levels, cash flow, and profitability, which are not considered in this analysis.

Question: I am a bank and I want to lend a loan to these companies, which one do you think is credible and why?

Invoking: sql_db_query_checker with {‘query’: ‘SELECT Company , Net Income`, Debt/Equity Ratio, Current Ratio, ROE, ROA, ROI FROM FinancialStatements20092022 WHERE Year = 2022 ORDER BY Net Income DESC, Debt/Equity Ratio ASC, Current Ratio DESC, ROE DESC, ROA DESC, ROI DESC LIMIT 10’}`

responded: Based on the schema of the FinancialStatements20092022 table, we can evaluate the credibility of a company for a loan based on several financial indicators such as:

1. Net Income: A higher net income indicates that the company is profitable.

2. Debt/Equity Ratio: A lower ratio is preferred as it indicates that the company is not heavily reliant on debt to finance its operations.

3. Current Ratio: A ratio above 1 indicates that the company can pay off its short-term liabilities with its short-term assets.

4. Return on Equity (ROE), Return on Assets (ROA), and Return on Investment (ROI): Higher values for these ratios indicate that the company is using its resources effectively to generate profits.

Let’s query the top 10 companies based on these indicators. We will consider the most recent year (2022) for this analysis.

SELECT Company , Net Income, Debt/Equity Ratio, Current Ratio, ROE, ROA, ROI FROM FinancialStatements20092022 WHERE Year = 2022 ORDER BY Net Income DESC, Debt/Equity Ratio ASC, Current Ratio DESC, ROE DESC, ROA DESC, ROI DESC LIMIT 10
Invoking: sql_db_query with {'query': 'SELECT Company , Net Income`, Debt/Equity Ratio, Current Ratio, ROE, ROA, ROI FROM FinancialStatements20092022 WHERE Year = 2022 ORDER BY Net Income DESC, Debt/Equity Ratio ASC, Current Ratio DESC, ROE DESC, ROA DESC, ROI DESC LIMIT 10'}`
[('AAPL', 99803.0, 2.3695, 0.8794, 196.9589, 28.2924, 66.6994), ('MSFT', 72738.0, 0.2989, 1.7846, 43.6755, 19.937, 34.0575), ('GOOG', 59972.0, 0.0574, 2.378, 23.4134, 16.4188, 22.1426), ('AIG', 10247.0, 0.6435, 1.0, 27.0084, 2.1411, 16.2446), ('NVDA', 9752.0, 0.4113, 6.6503, 36.6451, 22.0698, 25.9652), ('INTC', 8014.0, 0.4071, 1.5676, 7.7619, 4.4025, 5.687), ('BCS', 6212.949, 7.1602, 1.0, 8.624, 0.3946, 1.284), ('MCD', 6177.4, -5.9805, 1.4266, -102.8983, 12.2481, 20.6601), ('PYPL', 2419.0, 0.5138, 1.2753, 11.9315, 3.073, 7.8818), ('PCG', 1800.0, 2.2563, 0.8117, 7.8613, 1.5289, 2.5615)]Here are the top 10 companies based on their financial indicators for the year 2022:

1. AAPL (Apple Inc.): Net Income — 99,803, Debt/Equity Ratio — 2.37, Current Ratio — 0.88, ROE — 196.96, ROA — 28.29, ROI — 66.70

2. MSFT (Microsoft Corporation): Net Income — 72,738, Debt/Equity Ratio — 0.30, Current Ratio — 1.78, ROE — 43.68, ROA — 19.94, ROI — 34.06

3. GOOG (Google LLC): Net Income — 59,972, Debt/Equity Ratio — 0.06, Current Ratio — 2.38, ROE — 23.41, ROA — 16.42, ROI — 22.14

4. AIG (American International Group): Net Income — 10,247, Debt/Equity Ratio — 0.64, Current Ratio — 1.00, ROE — 27.01, ROA — 2.14, ROI — 16.24

5. NVDA (NVIDIA Corporation): Net Income — 9,752, Debt/Equity Ratio — 0.41, Current Ratio — 6.65, ROE — 36.65, ROA — 22.07, ROI — 25.97

6. INTC (Intel Corporation): Net Income — 8,014, Debt/Equity Ratio — 0.41, Current Ratio — 1.57, ROE — 7.76, ROA — 4.40, ROI — 5.69

7. BCS (Barclays PLC): Net Income — 6,213, Debt/Equity Ratio — 7.16, Current Ratio — 1.00, ROE — 8.62, ROA — 0.39, ROI — 1.28

8. MCD (McDonald’s Corporation): Net Income — 6,177, Debt/Equity Ratio — -5.98, Current Ratio — 1.43, ROE — -102.90, ROA — 12.25, ROI — 20.66

9. PYPL (PayPal Holdings, Inc.): Net Income — 2,419, Debt/Equity Ratio — 0.51, Current Ratio — 1.28, ROE — 11.93, ROA — 3.07, ROI — 7.88

10. PCG (Pacific Gas and Electric Company): Net Income — 1,800, Debt/Equity Ratio — 2.26, Current Ratio — 0.81, ROE — 7.86, ROA — 1.53, ROI — 2.56

Based on these indicators, Apple Inc. (AAPL), Microsoft Corporation (MSFT), and Google LLC (GOOG) appear to be the most credible companies for a loan due to their high net income, reasonable debt/equity ratios, and effective use of resources as indicated by their ROE, ROA, and ROI. However, a more comprehensive analysis would consider additional factors and risks specific to each company and industry.

Question: Please do a sector analysis on NVDA

To perform a sector analysis, we can compare NVDA’s financial performance with other companies in the same sector. The relevant columns for this analysis could be “Company”, “Category”, “Revenue”, “Gross Profit”, “Net Income”, “EBITDA”, “ROE”, “ROA”, “ROI”, “Net Profit Margin”, and “Number of Employees”.

Comparing these values with the sector averages, we can see that NVDA has lower revenue, gross profit, net income, EBITDA, ROI, and net profit margin than the sector average. However, NVDA’s ROE and ROA are comparable to the sector average. Also, NVDA operates with fewer employees than the sector average.

Question: Please do a Market Capitalization Analysis on NVDA

The market capitalization of NVDA over the last 10 years is as follows:

- 2023: 1000.35 B USD

- 2022: 359.5 B USD

- 2021: 735.86 B USD

- 2020: 323.24 B USD

- 2019: 144.0 B USD

- 2018: 81.44 B USD

- 2017: 117.26 B USD

- 2016: 57.53 B USD

- 2015: 17.73 B USD

- 2014: 10.9 B USD

Question: Explain to me what is Current Ratio analysis and why is it important? Also perform it on NVDA and show how you did it.

Question: Explain to me what is dupont analysis and why is it important? Also perform it on NVDA and show how you did it.

Question: Run all financial ratio tests for Nvidia.

Summary

In conclusion, this exploration has demonstrated the power of combining LangChain and OpenAI for financial statement analysis. By leveraging the SEC Edgar database and utilizing pandas and SQLAlchemy for data manipulation, we were able to construct a robust foundation for AI-driven insights.

LangChain’s “ChatOpenAI” module, fine-tuned with FinGPT, proved adept at handling various financial tasks, as exemplified by its ability to perform tasks like DuPont analysis, assess loan eligibility through business banking underwriting processes, and analyze current ratios. This successful integration opens doors for further exploration of LangChain’s capabilities in the financial domain.

--

--

Chris Shayan

Head of AI at Backbase The postings on this site are my own and do not necessarily represent the postings, strategies or opinions of my employer.