🍎 Studying AI in Food



Factorization Machine (FM)

FoodAI 2025. 3. 30. 22:11

💡 Introduction

Recommender systems are a core technology for personalizing the user experience in today's digital environments. In the food domain in particular, recommendations that account for individual taste and nutritional needs have become an important part of supporting a healthy diet. This post looks at the Factorization Machine (FM), published in 2010, and at how the model uses food feature information alongside ratings to make more precise recommendations.


I. Understanding Matrix Factorization and Its Limits

Matrix Factorization (MF) decomposes the Rating Matrix, which holds the ratings users have given to items, into a User Latent Matrix and an Item Latent Matrix.

The Rating Matrix has shape (number of users) × (number of items), and each cell holds the numeric rating that a user recorded for an item. Because most users rate only a small fraction of the items, the Rating Matrix is usually a sparse matrix. MF can be seen as predicting plausible ratings for the empty cells by way of this decomposition.

 

The rating data can be explicit feedback, such as scores or star ratings, but implicit feedback (for example, clicks or purchases) can be used just as well.

MF splits the Rating Matrix into two matrices as follows:

  • User Latent Matrix (U): (number of users) × K
  • Item Latent Matrix (I): (number of items) × K
  • Rating Matrix (R) ≈ U × Iᵀ: (users × K) × (K × items) => users × items

Here K is the dimension of the latent factors. Transposing the Item Latent Matrix and multiplying it with the User Latent Matrix gives a matrix with the same shape as the original Rating Matrix.

Each row of a latent matrix is a vector that stores the latent information of one user or one item. The dot product of row u of the user matrix and row i of the item matrix gives the predicted rating of user u for item i.
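The shapes above can be checked with a few lines of NumPy. This is only a minimal sketch: the toy sizes and the random latent matrices are assumptions made for illustration, not learned values.

```python
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items, K = 4, 5, 3  # toy sizes chosen for illustration

U = rng.normal(size=(n_users, K))  # User Latent Matrix: one row per user
I = rng.normal(size=(n_items, K))  # Item Latent Matrix: one row per item

# Transposing the item matrix gives (users x K) @ (K x items) -> (users x items).
R_hat = U @ I.T

# A single predicted rating is the dot product of one user row and one item row.
u, i = 2, 4
r_ui = float(U[u] @ I[i])
```

In a real system U and I would be fitted so that `R_hat` matches the observed ratings; the remaining cells of `R_hat` are then the predictions for the empty cells.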

>> Limits of Matrix Factorization

Standard Matrix Factorization uses only the user, the item, and the rating. It therefore cannot make use of the many other features of users and items. In the food domain, features such as the nutritional content of ingredients, cooking method, and allergy information matter a great deal, yet they are hard to feed into an MF-based recommender.


II. Challenges for Recommender Systems

Difficulty of Using SVMs

The SVM (Support Vector Machine) is a general predictor: it places few restrictions on the form of the data and handles tasks such as classification and regression, so it is widely used in machine learning and data mining.

In recommender systems, however, most users rate only a few items, so the data is typically very sparse. In that setting an SVM with a nonlinear kernel struggles: the weight of an interaction between two features can only be learned from training examples in which both features are active, and such examples are rare.
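To make the sparsity problem concrete, here is a small sketch with a made-up rating log (the user and item names are hypothetical). It counts how many (user, item) pairs ever appear together in training. A model that keeps an independent weight for each pair has nothing to learn from for the pairs that never appear.

```python
from itertools import product

# Hypothetical toy log of (user, item) ratings; names are illustrative only.
train_pairs = {
    ("alice", "salad"),
    ("alice", "ramen"),
    ("bob", "ramen"),
    ("carol", "steak"),
}
users = ["alice", "bob", "carol"]
items = ["salad", "ramen", "yogurt", "steak"]

# Every pair a recommender might be asked about.
all_pairs = set(product(users, items))

# Pairs with no training example: an independent per-pair weight gets no signal here.
unseen_pairs = all_pairs - train_pairs

# Fraction of pairs that have at least one training example.
coverage = len(train_pairs) / len(all_pairs)
```

Even in this tiny example most pairs are unseen, and real rating data is far sparser than this.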

Limits of Collaborative Filtering

One of the representative families of recommendation algorithms is the latent factor model, a form of collaborative filtering. Matrix Factorization is the most popular latent factor model; much like SVD, it maps users and items into an f-dimensional latent factor space (f plays the role of K in Section I).

The latent factors are inferred from patterns in the user-item ratings, and the interaction between a user and an item is computed as the dot product of their f-dimensional vectors.


III. Factorization Machine

The Factorization Machine (FM) combines the strengths of SVMs and Matrix Factorization, and has three main properties:

  1. Handles sparse data: it estimates parameters reliably even in sparse settings where SVMs are hard to train.
  2. Linear complexity: the model equation can be computed in linear time, and FM is optimized directly in the primal, so unlike an SVM there is no dual problem to solve.
  3. General predictor: it works with any real-valued feature vector.

Input Representation

The most distinctive aspect of FM is that it concatenates different kinds of features into a single vector. This has the following advantages:

  • Any additional feature, including implicit signals, can be appended to the feature vector as a real number.
  • Categorical features can be represented with one-hot encoding.
  • With feature engineering, the data is represented in the same way as the input to an SVM.
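A feature vector of this kind might be assembled as in the sketch below. The user names, food names, and the two extra features (scaled calories and an allergen flag) are hypothetical and chosen only to show the idea of concatenating one-hot blocks with real-valued features.

```python
import numpy as np

# Hypothetical vocabularies, for illustration only.
users = ["alice", "bob", "carol"]
foods = ["salad", "ramen", "yogurt", "steak"]

def one_hot(index: int, size: int) -> np.ndarray:
    """Return a vector of zeros with a single 1.0 at the given index."""
    vec = np.zeros(size)
    vec[index] = 1.0
    return vec

def build_feature_vector(user, food, calories_kcal, contains_allergen):
    """Concatenate one-hot user and food blocks with real-valued extras."""
    user_part = one_hot(users.index(user), len(users))
    food_part = one_hot(foods.index(food), len(foods))
    # Real-valued features are appended as-is (calories scaled to a small range).
    extra_part = np.array([calories_kcal / 1000.0, float(contains_allergen)])
    return np.concatenate([user_part, food_part, extra_part])

x = build_feature_vector("bob", "yogurt", calories_kcal=150, contains_allergen=True)
```

Here `x` has 3 + 4 + 2 = 9 entries, of which only four are non-zero. With realistic numbers of users and foods the vector becomes very long and almost entirely zero, which is the sparse setting FM is designed for.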

2-way Factorization Machine Model

FM is essentially an extension of linear regression that also models interactions between variables.

The linear regression model is written as:

 

$$\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i, w_0 \in \mathbb{R}, w \in \mathbb{R}^n$$

 

This model is simple and fast to compute, but it ignores interactions between variables, which limits its predictive accuracy.

The starting point for FM is to add interactions between variables to this linear model:

 

$$\hat{y}(x) = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} W_{ij}x_i x_j, w_0 \in \mathbb{R}, w \in \mathbb{R}^n, W \in \mathbb{R}^{n \times n}$$

 

Here $W_{ij}x_i x_j$ is the interaction term: it adds the product of two variables to the model, weighted by $W_{ij}$.

This resembles polynomial regression. What FM does differently is that the coefficient of each interaction term is not a free parameter but the dot product of the latent vectors of the two variables.

Efficiency Through Parameter Factorization

FM learns an embedding vector for each feature and takes dot products between them. Evaluated naively this adds computation, since there are n(n-1)/2 feature pairs, but the pairwise term can be rewritten algebraically so that it is computed efficiently.

 

$$\hat{y}(\mathbf{x}) := w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle\mathbf{v}_i, \mathbf{v}_j\rangle x_i x_j$$

 

Here $\langle\mathbf{v}_i, \mathbf{v}_j\rangle$ is the dot product of the k-dimensional latent vectors of the $i$-th and $j$-th features:

$$W_{ij} = \langle\mathbf{v}_i, \mathbf{v}_j\rangle = \sum_{f=1}^{k} v_{i,f} v_{j,f}$$
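For reference, the algebraic rewriting that makes the pairwise term cheap to evaluate is the identity given in Rendle (2010). It uses the same symbols as the model equation above:

$$\sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle\mathbf{v}_i, \mathbf{v}_j\rangle x_i x_j = \frac{1}{2} \sum_{f=1}^{k} \left[ \left( \sum_{i=1}^{n} v_{i,f} x_i \right)^2 - \sum_{i=1}^{n} v_{i,f}^2 x_i^2 \right]$$

The left-hand side costs on the order of $k n^2$ operations, while the right-hand side needs only one pass over the $n$ features per latent dimension, so on the order of $k n$. When $\mathbf{x}$ is sparse, only its non-zero entries contribute to the sums.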

With this factorization, FM goes beyond per-variable weights and models interactions through factorized parameters. Because each latent vector $\mathbf{v}_i$ is shared across every pair that involves feature $i$, the parameters of higher-order interactions can be estimated even under sparsity.
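As a rough sketch of why the 2-way model can be evaluated in linear time, the Python below computes the same prediction twice. The first version uses the double loop from the model equation. The second uses the fact that, per latent dimension, the pairwise sum equals half of the squared sum minus the sum of squares. NumPy is assumed, and the function names, toy sizes, and random parameters are made up for illustration.

```python
import numpy as np

def fm_predict_naive(x, w0, w, V):
    """Direct evaluation of the 2-way FM equation: O(k * n^2)."""
    n = len(x)
    y = w0 + w @ x
    for i in range(n):
        for j in range(i + 1, n):
            y += (V[i] @ V[j]) * x[i] * x[j]
    return float(y)

def fm_predict_linear(x, w0, w, V):
    """Same prediction via the squared-sum identity: O(k * n)."""
    s = V.T @ x                   # shape (k,): sum_i v_{i,f} x_i
    s_sq = (V ** 2).T @ (x ** 2)  # shape (k,): sum_i v_{i,f}^2 x_i^2
    return float(w0 + w @ x + 0.5 * np.sum(s ** 2 - s_sq))

rng = np.random.default_rng(0)
n, k = 9, 4                              # toy number of features and latent dimension
w0 = 0.1                                 # global bias
w = rng.normal(size=n)                   # per-feature weights
V = rng.normal(scale=0.1, size=(n, k))   # one k-dimensional latent vector per feature

# A sparse input: two one-hot blocks plus two real-valued features.
x = np.array([0, 1, 0, 0, 0, 1, 0, 0.15, 1.0])

y_naive = fm_predict_naive(x, w0, w, V)
y_linear = fm_predict_linear(x, w0, w, V)
```

The two values agree up to floating-point error. A production implementation would additionally skip the zero entries of `x`, which this sketch does not do.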

d-way Factorization Machine

FM is not limited to interactions between two features: it extends to a d-way Factorization Machine that models interactions among d features. The d-way FM can also be computed in linear time, so training remains efficient.


IV. Experimental Results

To evaluate the Factorization Machine, it was compared against an SVM in a very sparse setting such as the Netflix dataset, and the results were interesting.

 

 

The SVM failed to learn in the sparse setting, whereas FM trained stably as the dimension increased, with the prediction error (RMSE) decreasing gradually. In particular, FM performed best when the latent dimension k was 60 or higher.


V. Conclusion

The Factorization Machine is a recommendation algorithm that combines the strengths of SVMs and Matrix Factorization, and it has great potential in domains such as food, where many kinds of feature information have to be taken into account.

The main strengths of FM are:

  1. Flexible integration of features: many types of food feature information can be combined in a single model.
  2. Handles sparse data: it works well in realistic settings where no user rates every food.
  3. Computational efficiency: linear complexity keeps training efficient even on large food databases.
  4. Expressiveness: by modeling interactions between features, it can capture complex food preference patterns.

As interest in health and well-being continues to grow, Factorization Machines are expected to play an important role in building personalized food recommendation systems that take individual health goals and food features into account.


References:

  • Rendle, S. (2010). Factorization machines. In 2010 IEEE International Conference on Data Mining (pp. 995-1000).
  • Rendle, S. (2012). Factorization machines with libFM. ACM Transactions on Intelligent Systems and Technology (TIST), 3(3), 1-22.