fossilesque@mander.xyz to Science Memes@mander.xyz · English · 2 days ago
Publishers Always Innovating (image, mander.xyz)
keepthepace@slrpnk.net · 17 hours ago
Yes, PDFs are much more permissive and may not have any semantic information at all. Hell, some old publications are just scanned images! PDF -> semantic seems to be a hard problem that basically requires OCR, like these people are doing
JackbyDev@programming.dev · 13 hours ago
Oh nice, thanks for sharing that project. I haven’t heard of it before!