🍎 Studying AI in Food

DeepFM

💡 Introduction

A recommender system needs a model that captures the interactions between users and items effectively, and in particular one that accounts for interactions of every order. DeepFM, proposed in 2017, was designed for this problem. It considers low-order and high-order interactions at the same time, and it makes effective recommendation possible without complex feature engineering or pre-training.


I. Why interactions matter in recommender systems

In a recommender system it is very important to account for interactions among implicit features, that is, features that are not explicitly visible. A few everyday examples make this concrete:

Examples of interactions in real life

🍕 Interaction between time and app category (Order-2)

  • People tend to download food-delivery apps at mealtime
  • Here the order-2 interaction between "time" and "app category" is an important signal for predicting the user's click-through rate (CTR)

👨‍👦 Interaction among gender, age, and app category (Order-3)

  • Boys tend to prefer shooting games and RPG apps
  • The order-3 interaction among "gender", "age", and "app category" affects the click-through rate

High-quality recommendation therefore has to consider interactions of every order, from low to high. Modeling all of them by hand, however, is not feasible.

Interactions hidden in the data

Some interactions are obvious, as in the examples above, but most of them are hidden in the data. For example:

🛒 The well-known marketing anecdote "people who buy diapers also buy beer" is an order-2 interaction that is hard to anticipate by intuition.

In addition, the number of possible feature interactions grows exponentially with the number of features, so modeling interactions of every order by hand is impractical. A machine learning model has to capture interactions of different orders automatically.
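As a rough illustration of that growth, the sketch below counts how many feature combinations exist at a given order. It assumes each interaction is an unordered set of distinct features, which is a simplification: in real CTR data each feature also takes many values, so the true space is larger still. The function names are hypothetical.

```python
from math import comb


def count_interactions(num_features: int, order: int) -> int:
    """Number of distinct feature subsets of size `order` (assumed definition of an interaction)."""
    return comb(num_features, order)


def count_all_interactions(num_features: int) -> int:
    """Number of subsets of size 2 or more: 2^d - d - 1."""
    return 2 ** num_features - num_features - 1


# Print order-2, order-3, and total counts for a few feature counts.
for d in (10, 20, 40):
    print(d, count_interactions(d, 2), count_interactions(d, 3), count_all_interactions(d))
```

Even with only a few dozen features, the total is far beyond what anyone could enumerate and hand-craft.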

Limitations of existing models

Factorization Machine (FM)

  • Models feature interactions as inner products of latent vectors, and in principle can be extended from Order-2 up to Order-n
  • The Order-2 term can be computed in linear time, but higher orders are expensive, so in practice usually only Order-2 interactions are modeled

Wide & Deep

  • Combines a linear ("wide") model with a neural network ("deep") model
  • Can model both low-order and high-order interactions
  • However, the wide part relies on hand-selected features that are combined through a cross-product transformation and added as new input features
  • This requires expert feature engineering
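To make the feature-engineering burden concrete, here is a minimal sketch of a cross-product transformation over binary features: the crossed feature is 1 only when every selected feature is 1. The feature names and the list of crosses are invented for illustration; in practice an engineer has to decide which crosses are worth adding.

```python
def cross_product(x: dict, selected: list) -> int:
    """Product of the selected binary features: 1 only if all of them are 1."""
    result = 1
    for name in selected:
        result *= x.get(name, 0)
    return result


# Hypothetical one-hot style input for a single impression.
sample = {"time=lunch": 1, "app=food_delivery": 1, "gender=male": 0}

# Hand-picked crosses: this list is the manual feature-engineering step.
crosses = [
    ["time=lunch", "app=food_delivery"],
    ["gender=male", "app=food_delivery"],
]

wide_features = dict(sample)
for selected in crosses:
    wide_features[" AND ".join(selected)] = cross_product(sample, selected)

print(wide_features)
```

Every useful cross has to be listed explicitly, which is the step DeepFM aims to avoid.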

II. DeepFM: a model that combines the best of both worlds

DeepFM was designed to address the problems above. Its main characteristics are:

Key features

  • 📊 Models interactions of every order end-to-end
  • 🧰 Needs no expert feature engineering
  • 🔄 The FM component and the Deep component share the same input and the same embeddings

Model architecture

DeepFM consists of two core components:

  1. FM component (Low-order): based on the Factorization Machine
  2. Deep component (High-order): based on a deep neural network

The two components share the same input and the same embedding layer, which is an important difference from the Wide & Deep model.

1. FM component

The FM component works much like the original Factorization Machine:

  • Besides the order-1 (linear) terms, it models order-2 interactions as inner products of the features' latent vectors
  • It is computed with the following formula:

$$y_{FM} = \langle w, x \rangle + \sum_{j_1=1}^{d}\sum_{j_2=j_1+1}^{d}\langle V_{j_1}, V_{j_2} \rangle \, x_{j_1} x_{j_2}$$

Here $w \in \mathbb{R}^d$ holds the order-1 weights and $V_j \in \mathbb{R}^k$ is the latent vector of feature $j$.
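The FM formula can be evaluated literally with a double loop over feature pairs, or with the standard algebraic rewrite of the pairwise sum that runs in O(kd). The sketch below implements both in plain Python and checks that they agree. The sizes and random values are arbitrary, and the function names are illustrative rather than taken from any library.

```python
import random


def fm_naive(w, V, x):
    """Direct evaluation: linear term plus a sum over all feature pairs."""
    d = len(x)
    first_order = sum(w[j] * x[j] for j in range(d))
    second_order = 0.0
    for j1 in range(d):
        for j2 in range(j1 + 1, d):
            dot = sum(a * b for a, b in zip(V[j1], V[j2]))
            second_order += dot * x[j1] * x[j2]
    return first_order + second_order


def fm_fast(w, V, x):
    """Same value in O(k*d): 0.5 * sum_f [(sum_j V[j][f] x[j])^2 - sum_j (V[j][f] x[j])^2]."""
    d, k = len(x), len(V[0])
    first_order = sum(w[j] * x[j] for j in range(d))
    second_order = 0.0
    for f in range(k):
        s = sum(V[j][f] * x[j] for j in range(d))
        s_sq = sum((V[j][f] * x[j]) ** 2 for j in range(d))
        second_order += 0.5 * (s * s - s_sq)
    return first_order + second_order


random.seed(0)
d, k = 6, 3  # arbitrary sizes for the demo
w = [random.uniform(-1, 1) for _ in range(d)]
V = [[random.uniform(-1, 1) for _ in range(k)] for _ in range(d)]
x = [1.0, 0.0, 1.0, 0.0, 0.0, 1.0]  # sparse input, as is typical for one-hot features

print(abs(fm_naive(w, V, x) - fm_fast(w, V, x)) < 1e-9)
```

The second form is why the order-2 term stays cheap even when the number of features is large.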

2. Deep component

The Deep component uses a feed-forward neural network to learn high-order interactions:

  • Handles input data that mixes categorical and continuous features
  • Feeds the low-dimensional embeddings of all features into the network
  • Converts feature vectors of different lengths into embedding vectors of the same size

The key point is that the weights V, which the FM component uses as latent features to compute order-2 interactions, also serve as the weights of the embedding layer in the Deep component. This is how the two components share a single embedding layer.
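The sharing described above can be sketched as a small forward pass in which one embedding lookup feeds both components and the two outputs are summed before a sigmoid. This is a toy illustration under stated assumptions, not the authors' implementation: the field sizes, the single hidden layer, the random initialization, and all names are hypothetical, and every input field is assumed to be one-hot.

```python
import math
import random

random.seed(0)
FIELD_SIZES = [4, 3, 5]  # hypothetical vocabulary size of each categorical field
K = 2                    # embedding size shared by both components
HIDDEN = 4               # hypothetical hidden-layer width


def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


# One embedding table V and one order-1 weight vector w per field.
V = [rand_matrix(n, K) for n in FIELD_SIZES]
w = [[random.uniform(-0.1, 0.1) for _ in range(n)] for n in FIELD_SIZES]
W1 = rand_matrix(HIDDEN, len(FIELD_SIZES) * K)  # weights of the single hidden layer
b1 = [0.0] * HIDDEN
W_out = [random.uniform(-0.1, 0.1) for _ in range(HIDDEN)]


def lookup(sample):
    """Shared embedding lookup: one K-dimensional vector per field."""
    return [V[f][idx] for f, idx in enumerate(sample)]


def fm_component(sample, emb):
    """Order-1 weights plus pairwise inner products of the shared embeddings."""
    first_order = sum(w[f][idx] for f, idx in enumerate(sample))
    second_order = 0.0
    for i in range(len(emb)):
        for j in range(i + 1, len(emb)):
            second_order += sum(a * b for a, b in zip(emb[i], emb[j]))
    return first_order + second_order


def deep_component(emb):
    """Feed-forward network over the concatenation of the same embeddings."""
    a0 = [v for e in emb for v in e]
    h = [max(0.0, sum(wi * ai for wi, ai in zip(row, a0)) + b) for row, b in zip(W1, b1)]
    return sum(wo * hi for wo, hi in zip(W_out, h))


def deepfm_predict(sample):
    emb = lookup(sample)  # computed once, used by both components
    logit = fm_component(sample, emb) + deep_component(emb)
    return 1.0 / (1.0 + math.exp(-logit))


print(deepfm_predict([1, 0, 3]))  # predicted click probability for one sample
```

Because `lookup` returns rows of the same table `V` to both components, a gradient step from either the FM output or the deep output would update the same embedding parameters.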


III. Efficiency and effectiveness of DeepFM

Efficiency

DeepFM is also competitive with other models in processing speed. Because it needs no pre-training stage, its total training time is much shorter than that of models that do.

Effectiveness

In CTR (Click-Through Rate) prediction, DeepFM outperforms the other models:

  • LR (Logistic Regression), which ignores interactions, performs worse than the other models → evidence that modeling interactions matters
  • Models that consider high-order interactions outperform models that consider only low-order interactions
  • DeepFM, which shares its embeddings, records the best results in both AUC and LogLoss

These results show clearly both the importance of considering interactions of different orders in a recommender system and the effectiveness of DeepFM.
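For readers unfamiliar with the two metrics, the sketch below computes AUC and LogLoss in plain Python. The click labels and scores are invented for illustration and are not results from the DeepFM experiments. The AUC here uses the pairwise definition (the fraction of positive/negative pairs ranked correctly, with ties counted as half), which is quadratic in the number of samples and meant only for small examples.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)


def auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels (1 = click) and model scores.
clicks = [1, 0, 1, 0, 0]
scores = [0.9, 0.3, 0.4, 0.6, 0.1]

print(auc(clicks, scores), log_loss(clicks, scores))
```

Higher AUC is better, while lower LogLoss is better, so a model that wins on both ranks clicks well and is also well calibrated.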


IV. Conclusion 🎯

DeepFM offers an effective way to model low-order and high-order interactions at the same time in a recommender system. Its main advantages are:

  • Learns interactions of different orders end-to-end, without complex feature engineering
  • Improves the model's expressive power by sharing embeddings between the FM component and the Deep component
  • Performs well across a variety of recommendation scenarios

For anyone building or improving a recommender system, DeepFM can be an effective alternative that goes beyond the limits of earlier models. It is a particularly good fit when the goal is to capture complex interactions between users and items and deliver more accurate recommendations.


References:

  • Guo, H., Tang, R., Ye, Y., Li, Z., & He, X. (2017). DeepFM: A Factorization-Machine based Neural Network for CTR Prediction. arXiv preprint arXiv:1703.04247.
  • Cheng, H. T., Koc, L., Harmsen, J., Shaked, T., Chandra, T., Aradhye, H., ... & Shah, H. (2016). Wide & deep learning for recommender systems. In Proceedings of the 1st workshop on deep learning for recommender systems.
  • Rendle, S. (2010). Factorization machines. In 2010 IEEE International Conference on Data Mining.