add in haystack.deepset.ai
qiyanjun committed Feb 27, 2024
1 parent dd02df4 commit 0a7855e
Showing 9 changed files with 44 additions and 30 deletions.
2 changes: 1 addition & 1 deletion _contents/S0-L10.md
@@ -7,7 +7,7 @@ extraContent:
notes: team-2
video: team-5
tags:
- 1Basic
- Bias
---

In this session, our readings cover:
2 changes: 1 addition & 1 deletion _contents/S0-L11.md
@@ -7,7 +7,7 @@ extraContent:
notes: team-3
video: team-1
tags:
- 1Basic
- Safety
---

In this session, our readings cover:
4 changes: 2 additions & 2 deletions _contents/S0-L12.md
@@ -2,12 +2,12 @@
layout: post
title: LLM multimodal / multilingual harm responses
lecture:
lectureVersion: next
lectureVersion: current
extraContent:
notes: team-4
video: team-3
tags:
- 1Basic
- Safety
---

In this session, our readings cover:
11 changes: 6 additions & 5 deletions _contents/S0-L13.md
@@ -7,7 +7,7 @@ extraContent:
notes: team-5
video: team-3
tags:
- 1Basic
- Safety
---

In this session, our readings cover:
@@ -27,10 +27,6 @@ In this session, our readings cover:



### Managing Existential Risk from AI without Undercutting Innovation
+ https://www.csis.org/analysis/managing-existential-risk-ai-without-undercutting-innovation




### A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
@@ -58,3 +54,8 @@ In this session, our readings cover:



### Managing Existential Risk from AI without Undercutting Innovation
+ https://www.csis.org/analysis/managing-existential-risk-ai-without-undercutting-innovation



16 changes: 8 additions & 8 deletions _contents/S0-L15.md
@@ -23,14 +23,6 @@ In this session, our readings cover:

## More Readings:

### Do Language Models Know When They're Hallucinating References?
+ https://arxiv.org/abs/2305.18248

### Survey of Hallucination in Natural Language Generation
+ https://arxiv.org/abs/2202.03629

### Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment
+ https://arxiv.org/abs/2308.05374

### LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond
+ https://arxiv.org/abs/2305.14540
@@ -39,4 +31,12 @@ In this session, our readings cover:



### Do Language Models Know When They're Hallucinating References?
+ https://arxiv.org/abs/2305.18248

### Survey of Hallucination in Natural Language Generation
+ https://arxiv.org/abs/2202.03629

### Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment
+ https://arxiv.org/abs/2308.05374

13 changes: 7 additions & 6 deletions _contents/S0-L16.md
@@ -14,15 +14,10 @@ In this session, our readings cover:

## Required Readings:

### BloombergGPT: A Large Language Model for Finance
+ https://arxiv.org/abs/2303.17564
+ The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks; however, no LLM specialized for the financial domain has been reported in literature. In this work, we present BloombergGPT, a 50 billion parameter language model that is trained on a wide range of financial data. We construct a 363 billion token dataset based on Bloomberg's extensive data sources, perhaps the largest domain-specific dataset yet, augmented with 345 billion tokens from general purpose datasets. We validate BloombergGPT on standard LLM benchmarks, open financial benchmarks, and a suite of internal benchmarks that most accurately reflect our intended usage. Our mixed dataset training leads to a model that outperforms existing models on financial tasks by significant margins without sacrificing performance on general LLM benchmarks. Additionally, we explain our modeling choices, training process, and evaluation methodology. We release Training Chronicles (Appendix C) detailing our experience in training BloombergGPT.

## More Readings:

### Large language models generate functional protein sequences across diverse families
+ https://pubmed.ncbi.nlm.nih.gov/36702895/

## More Readings:

### FunSearch: Making new discoveries in mathematical sciences using Large Language Models
+ https://deepmind.google/discover/blog/funsearch-making-new-discoveries-in-mathematical-sciences-using-large-language-models/
@@ -47,3 +42,9 @@ In this session, our readings cover:


### ConvNets Match Vision Transformers at Scale


### BloombergGPT: A Large Language Model for Finance
+ https://arxiv.org/abs/2303.17564
+ The use of NLP in the realm of financial technology is broad and complex, with applications ranging from sentiment analysis and named entity recognition to question answering. Large Language Models (LLMs) have been shown to be effective on a variety of tasks; however, no LLM specialized for the financial domain has been reported in literature. In this work, we present BloombergGPT, a 50 billion parameter language model that is trained on a wide range of financial data. We construct a 363 billion token dataset based on Bloomberg's extensive data sources, perhaps the largest domain-specific dataset yet, augmented with 345 billion tokens from general purpose datasets. We validate BloombergGPT on standard LLM benchmarks, open financial benchmarks, and a suite of internal benchmarks that most accurately reflect our intended usage. Our mixed dataset training leads to a model that outperforms existing models on financial tasks by significant margins without sacrificing performance on general LLM benchmarks. Additionally, we explain our modeling choices, training process, and evaluation methodology. We release Training Chronicles (Appendix C) detailing our experience in training BloombergGPT.

6 changes: 2 additions & 4 deletions _contents/S0-L21.md
@@ -22,10 +22,8 @@ In this session, our readings cover:
## More Readings:



### Orca 2: Teaching Small Language Models How to Reason
+ https://arxiv.org/abs/2311.11045

### Long context prompting for Claude 2.1
+ https://www.anthropic.com/news/claude-2-1-prompting

### Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding
+ This work aims at decreasing the end-to-end generation latency of large language models (LLMs). One of the major causes of the high generation latency is the sequential decoding approach adopted by almost all state-of-the-art LLMs. In this work, motivated by the thinking and writing process of humans, we propose Skeleton-of-Thought (SoT), which first guides LLMs to generate the skeleton of the answer, and then conducts parallel API calls or batched decoding to complete the contents of each skeleton point in parallel. Not only does SoT provide considerable speed-ups across 12 LLMs, but it can also potentially improve the answer quality on several question categories. SoT is an initial attempt at data-centric optimization for inference efficiency, and further underscores the potential of pushing LLMs to think more like a human for answer quality.
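
The two-stage structure of SoT is simple enough to sketch. Below is a minimal illustration, not the authors' code: `call_llm(prompt)` is a hypothetical stand-in for whatever LLM API you use, and a thread pool plays the role of the paper's parallel API calls or batched decoding.

```python
import concurrent.futures

def call_llm(prompt: str) -> str:
    # Hypothetical client: plug in any LLM API here.
    raise NotImplementedError("plug in your LLM client")

def skeleton_of_thought(question: str) -> str:
    # Stage 1: ask the model for a short skeleton (outline) of the answer.
    skeleton_prompt = (
        f"Question: {question}\n"
        "List 3-5 short bullet points that outline the answer. Do not elaborate."
    )
    points = [p.strip("-* ").strip()
              for p in call_llm(skeleton_prompt).splitlines() if p.strip()]

    # Stage 2: expand every skeleton point in parallel; this is where the
    # end-to-end latency reduction comes from.
    def expand(point: str) -> str:
        return call_llm(
            f"Question: {question}\n"
            f"Expand this outline point in 2-3 sentences: {point}"
        )

    with concurrent.futures.ThreadPoolExecutor() as pool:
        expansions = list(pool.map(expand, points))
    return "\n\n".join(expansions)
```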
4 changes: 4 additions & 0 deletions _contents/S0-L22.md
@@ -29,3 +29,7 @@ In this session, our readings cover:
+ https://arxiv.org/abs/2203.11171
+ Chain-of-thought prompting combined with pre-trained large language models has achieved encouraging results on complex reasoning tasks. In this paper, we propose a new decoding strategy, self-consistency, to replace the naive greedy decoding used in chain-of-thought prompting. It first samples a diverse set of reasoning paths instead of only taking the greedy one, and then selects the most consistent answer by marginalizing out the sampled reasoning paths. Self-consistency leverages the intuition that a complex reasoning problem typically admits multiple different ways of thinking leading to its unique correct answer. Our extensive empirical evaluation shows that self-consistency boosts the performance of chain-of-thought prompting with a striking margin on a range of popular arithmetic and commonsense reasoning benchmarks, including GSM8K (+17.9%), SVAMP (+11.0%), AQuA (+12.2%), StrategyQA (+6.4%) and ARC-challenge (+3.9%).
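
As a rough sketch of the decoding strategy (the `sample` and `extract_answer` helpers below are hypothetical, since the method is model-agnostic): sample several reasoning paths at nonzero temperature, then majority-vote the final answers.

```python
from collections import Counter

def self_consistency(question, sample, extract_answer, k=10):
    """sample(question, temperature) -> one chain-of-thought completion (hypothetical).
    extract_answer(text) -> the final answer parsed from a completion (hypothetical)."""
    # Sample a diverse set of reasoning paths instead of one greedy decode...
    answers = [extract_answer(sample(question, temperature=0.7)) for _ in range(k)]
    # ...then marginalize over paths by taking the most frequent final answer.
    return Counter(answers).most_common(1)[0][0]
```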



### Orca 2: Teaching Small Language Models How to Reason
+ https://arxiv.org/abs/2311.11045
16 changes: 13 additions & 3 deletions _contents/S0-L24.md
@@ -19,22 +19,32 @@ In this session, our readings cover:
+ https://arxiv.org/abs/2312.15234
+ In the rapidly evolving landscape of artificial intelligence (AI), generative large language models (LLMs) stand at the forefront, revolutionizing how we interact with our data. However, the computational intensity and memory consumption of deploying these models present substantial challenges in terms of serving efficiency, particularly in scenarios demanding low latency and high throughput. This survey addresses the imperative need for efficient LLM serving methodologies from a machine learning system (MLSys) research perspective, standing at the crux of advanced AI innovations and practical system optimizations. We provide in-depth analysis, covering a spectrum of solutions, ranging from cutting-edge algorithmic modifications to groundbreaking changes in system designs. The survey aims to provide a comprehensive understanding of the current state and future directions in efficient LLM serving, offering valuable insights for researchers and practitioners in overcoming the barriers of effective LLM deployment, thereby reshaping the future of AI.


### Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling
+ https://arxiv.org/abs/2304.01373
+ How do large language models (LLMs) develop and evolve over the course of training? How do these patterns change as models scale? To answer these questions, we introduce *Pythia*, a suite of 16 LLMs all trained on public data seen in the exact same order and ranging in size from 70M to 12B parameters. We provide public access to 154 checkpoints for each one of the 16 models, alongside tools to download and reconstruct their exact training dataloaders for further study. We intend *Pythia* to facilitate research in many areas, and we present several case studies including novel results in memorization, term frequency effects on few-shot performance, and reducing gender bias. We demonstrate that this highly controlled setup can be used to yield novel insights toward LLMs and their training dynamics. Trained models, analysis code, training code, and training data can be found at this https URL.
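
For reference, loading one of the released checkpoints is short, assuming (as the project documents) that the models are hosted on the Hugging Face Hub under `EleutherAI/pythia-<size>` with one git revision per training step; revision names like `step3000` are taken from the model cards and should be checked there.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# One revision per training step lets you study training dynamics directly.
model = AutoModelForCausalLM.from_pretrained("EleutherAI/pythia-70m", revision="step3000")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/pythia-70m", revision="step3000")

inputs = tokenizer("Hello, I am", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```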

## More Readings:

### OpenMoE
+ https://github.com/XueFuzhao/OpenMoE


### Langchain:
+ https://python.langchain.com/docs/get_started/introduction


### haystack.deepset.ai
+ https://github.com/deepset-ai/haystack
+ LLM orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
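
A minimal sketch of the component/pipeline model described above, in the style of the Haystack 2.x API; the module paths and component names here are assumptions that may differ between versions, so treat this as illustrative rather than as the library's definitive interface.

```python
from haystack import Document, Pipeline
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.retrievers.in_memory import InMemoryBM25Retriever

# Write a couple of documents into an in-memory store.
store = InMemoryDocumentStore()
store.write_documents([
    Document(content="Haystack pipelines connect components such as retrievers and generators."),
    Document(content="RAG systems retrieve documents before generating an answer."),
])

# Connect components into a pipeline; here a single BM25 retriever.
pipeline = Pipeline()
pipeline.add_component("retriever", InMemoryBM25Retriever(document_store=store))

result = pipeline.run({"retriever": {"query": "What do Haystack pipelines do?"}})
print(result["retriever"]["documents"][0].content)
```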




### LlamaIndex
+ https://docs.llamaindex.ai/en/stable/
+ LlamaIndex supports Retrieval-Augmented Generation (RAG). Instead of asking the LLM to generate an answer immediately, LlamaIndex first retrieves information from your data sources, adds it to your question as context, and asks the LLM to answer based on the enriched prompt.
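
The canonical starter example makes that flow concrete. This is a sketch: the import path varies across LlamaIndex versions, and `data/` is a hypothetical folder holding your own documents.

```python
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Load and index your documents; retrieval happens at query time.
documents = SimpleDirectoryReader("data").load_data()
index = VectorStoreIndex.from_documents(documents)

# The query engine retrieves relevant chunks, packs them into the prompt
# as context, and asks the LLM to answer from that enriched prompt.
query_engine = index.as_query_engine()
print(query_engine.query("What do these documents say about RAG?"))
```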


### OpenMoE
+ https://github.com/XueFuzhao/OpenMoE

