Handling summarization failures and low input token counts #103

Merged
J-Hoplin merged 8 commits into develop from feat/input-token-count
Sep 23, 2024

Conversation

J-Hoplin
Collaborator

@J-Hoplin J-Hoplin commented Sep 15, 2024

PR description

Additions and changes

  • Added a token counter for OpenAI input text
  • If summarization fails, or the input token count is low (set to 300 for now), the response now falls back to the thumbnail description and carries a success: false flag
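
The fallback rule described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the `SummaryResult` shape and the function names `hasEnoughInput` and `toSummaryResult` are hypothetical, and treating counts strictly below 300 as "too low" is an assumption, since the thread does not say whether the comparison is inclusive.

```typescript
// Hypothetical result shape; the PR's actual types are not shown in this thread.
interface SummaryResult {
  success: boolean;
  description: string;
}

// Provisional minimum taken from the PR description ("set to 300 for now").
const LEAST_TOKEN_THRESHOLD = 300;

// Assumption: inputs strictly below the threshold count as too short to summarize.
function hasEnoughInput(tokenCount: number): boolean {
  return tokenCount >= LEAST_TOKEN_THRESHOLD;
}

// `summary` is the model output, or null when the call failed or was skipped.
function toSummaryResult(
  tokenCount: number,
  summary: string | null,
  thumbnailDescription: string,
): SummaryResult {
  if (!hasEnoughInput(tokenCount) || summary === null) {
    // Fall back to the thumbnail description and flag the result as unsuccessful.
    return { success: false, description: thumbnailDescription };
  }
  return { success: true, description: summary };
}
```

In a real service, `hasEnoughInput` would typically be checked before calling the model at all, so that short inputs do not spend an API request.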

Key points for review

Some links still do not work. This seems to be because those sites block cloud provider IP ranges.
https://stackoverflow.com/questions/59568584/cannot-reach-some-websites-within-aws-lambda-function

Screenshots

@github-actions github-actions bot added the document (documentation-related edits and additions) and feature labels Sep 15, 2024
Collaborator

@hye-on hye-on left a comment

Great work!! 👍😀

Comment on lines +30 to +34
tokenCount += 4;
tokenCount += encoder.encode(message.role).length;
tokenCount += encoder.encode(message.content).length;
}
tokenCount += 2;
Member

@J-Hoplin
I'm curious why 4 and 2 are added here.

Also, is there a reason for including the token length of the role that goes into the prompt?
Deciding based only on the token length of the content passed in as a parameter (the parsed URL content) seems more intuitive, and then the util function would not be needed.

If it is needed, using reduce to return tokenCount would also be a nice option.

Contributor

I'm curious too.

Collaborator Author

When prompting, OpenAI stringifies the JSON that holds the function and role before sending it, so the counter adds those tokens to match that calculation. I have already shared this explanation with the team!
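
For reference, the counting loop under review could be written with reduce, as suggested above. This is only a sketch: the constants 4 and 2 are copied from the diff, while the `Encoder` and `ChatMessage` interfaces, the constant names, and the function name `countPromptTokens` are assumptions made for illustration. The `Encoder` interface mirrors the `encode(text)` call seen in the diff; `wordEncoder` is a stand-in that yields one token per word so the arithmetic is easy to follow, and it is not a real tokenizer.

```typescript
// Minimal interface matching the `encoder.encode(...)` calls in the diff.
interface Encoder {
  encode(text: string): number[];
}

// Hypothetical message shape with the two fields the diff reads.
interface ChatMessage {
  role: string;
  content: string;
}

const PER_MESSAGE_OVERHEAD = 4; // added once per message in the diff
const TRAILING_OVERHEAD = 2; // added once after the loop in the diff

function countPromptTokens(messages: ChatMessage[], encoder: Encoder): number {
  const messageTokens = messages.reduce(
    (total, message) =>
      total +
      PER_MESSAGE_OVERHEAD +
      encoder.encode(message.role).length +
      encoder.encode(message.content).length,
    0,
  );
  return messageTokens + TRAILING_OVERHEAD;
}

// Stand-in encoder for illustration only: one token per whitespace-separated word.
const wordEncoder: Encoder = {
  encode: (text) =>
    text
      .split(/\s+/)
      .filter(Boolean)
      .map((_, index) => index),
};

// One message: 4 (overhead) + 1 (role) + 2 (content) + 2 (trailing) = 9
const example = countPromptTokens(
  [{ role: "user", content: "hello world" }],
  wordEncoder,
);
```

Passing the encoder in as a parameter keeps the function independent of any particular tokenizer package, which also makes it straightforward to test.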

@@ -1 +1,2 @@
export const IS_LOCAL = process.env.NODE_ENV === 'local';
export const DEFAULT_FOLDER_NAME = '나중에 읽을 링크';
Contributor

Looks good.

Collaborator Author

Good!

tsconfig.json Outdated
@@ -6,6 +6,7 @@
"emitDecoratorMetadata": true,
"experimentalDecorators": true,
"allowSyntheticDefaultImports": true,
"resolveJsonModule": true,
Contributor

Was there a need to read in a JSON file?

Collaborator Author

Ah, I originally planned to use tiktoken rather than js-tiktoken, and that package required importing a JSON file directly. Since we now use a different package, I think this option can be removed.

@@ -14,6 +15,7 @@ import { SummarizeURLContent } from './types/types';
@Injectable()
export class AiService {
private openai: OpenAI;
private leastTokenThreshold = 300;
Contributor

I'm curious why this is declared as a private class member when it is only used in one method of the class.

Collaborator Author

@J-Hoplin J-Hoplin Sep 23, 2024

Good point. It does not need to be held as an instance property, so I changed it.

@J-Hoplin J-Hoplin merged commit 541dfa5 into develop Sep 23, 2024
1 check passed
@J-Hoplin J-Hoplin deleted the feat/input-token-count branch September 23, 2024 15:02
Labels
document (documentation-related edits and additions), feature
4 participants