Comparing and evaluating ChatGPT’s performance giving financial advice with Reddit questions and answers

Sathvik Samant (1), Aditya Dhar (2), Shreya Kochar (3), Aneesha Sreerama (4), Andrew Wang (5), Anirudh Sreerama (6)

(1) The Lawrenceville School, (2) Fairview High School, (3) Columbia University, (4) Northeastern University, (5) Stony Brook University, (6) University of California, Berkeley

Aug 13, 2024

https://doi.org/10.59720/23-296

As artificial intelligence (AI), and particularly large language models (LLMs), rapidly advances, there is growing interest in applying AI in the financial industry and in understanding its impact on the future of financial advising. To evaluate the performance of such LLMs in the role of a financial advisor, this experiment used financial questions asked on the Reddit forum “r/Financial Planning”. We hypothesized that ChatGPT would offer commonly observed, yet concise, feedback related to typical financial behaviors rather than delivering personalized financial advice. We compared the GPT-4 outputs to actual Reddit comments, assessing the model’s response content, length, and advice. By evaluating the model’s advisory competency, this study explored the role of AI in financial forums, its ethical consequences, and its potential threat to employment and existing systems. We found that while AI can present accurate information, it fell short in delivery, clarity, and decisiveness. This study further analyzed the implications of GPT-4’s performance and its impact on future financial forum systems. More broadly, this study revealed that, at its current capabilities, GPT-4 does not pose a direct threat to traditional financial forums but has the potential to shift financial forums and advisories toward more AI-based systems in the future.