sqlite - Best way to store 100,000+ CSV text files on server?
We have an application that needs to store thousands of small CSV files: 100,000+ already, growing by about the same amount annually. Each file contains around 20-80 KB of vehicle tracking data, and each data set (or file) represents a single vehicle journey.
We currently store this data in SQL Server, but the size of the database is getting a little unwieldy, and we only ever need to access the journey data one file at a time (so the ability to query it in bulk, or otherwise keep it in a relational database, is not needed). Database performance is also degrading as we add more tracks, due to the time taken to rebuild or update indexes when inserting or deleting data.
There are three options we are considering:
Use the FILESTREAM feature of SQL Server to externalise the data files. I've not used this feature before. Does FILESTREAM still result in one physical file per database object (blob)?
Alternatively, store the files individually on disk. There would end up being half a million of them after 3+ years. Will the NTFS file system cope OK with that amount?
If lots of files is a problem, should we consider grouping the datasets/files into a small database (one per user)? Is there a lightweight database like SQLite that can store files?
One further point: the data is highly compressible. Zipping the files reduces them to 10% of their original size. I would like to use compression if possible, to minimise the disk space used and the backup size.
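If the third option is taken, a minimal sketch of what it could look like in Python, assuming one SQLite file per user acting as a simple blob store; the `journeys` table name and schema are illustrative, not from the question:

```python
import sqlite3
import zlib

def open_store(path):
    """Open (or create) a SQLite file used as a simple blob store for journeys."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS journeys ("
        "  journey_id TEXT PRIMARY KEY,"
        "  csv_gz     BLOB NOT NULL"   # zlib-compressed CSV bytes
        ")"
    )
    return conn

def put_journey(conn, journey_id, csv_text):
    """Compress the CSV text (roughly 10% of original size per the question) and store it."""
    blob = zlib.compress(csv_text.encode("utf-8"), level=9)
    conn.execute(
        "INSERT OR REPLACE INTO journeys (journey_id, csv_gz) VALUES (?, ?)",
        (journey_id, blob),
    )
    conn.commit()

def get_journey(conn, journey_id):
    """Fetch and decompress a single journey, matching the one-file-at-a-time access pattern."""
    row = conn.execute(
        "SELECT csv_gz FROM journeys WHERE journey_id = ?", (journey_id,)
    ).fetchone()
    return zlib.decompress(row[0]).decode("utf-8") if row else None
```

One database per user keeps each SQLite file small, gives you compression, and makes the backup unit a single file per user.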
I have a few thoughts on this, and it is all subjective, so your mileage and other readers' mileage may vary, but it may still get the ball rolling if other folks want to put forward differing points of view...
Firstly, I have seen performance issues with folders containing very many files. One project got around this by creating 256 directories, called 00, 01, 02 ... fd, fe, ff, and within each one of those a further 256 directories with the same naming convention. That potentially divides your 500,000 files across 65,536 directories, giving only a few in each, provided you use a hash/random generator to spread them out. Also, the filenames stay pretty short to store in your database, e.g. 32/af/file-xyz.csv
Doubtless someone will bite my head off, but I feel 10,000 files in one directory is plenty to be going on with.
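A minimal sketch of that two-level fan-out in Python, assuming the filename itself is hashed to pick the two directory levels (the base path and example names are illustrative):

```python
import hashlib
from pathlib import Path

def journey_path(base_dir, filename):
    """Map a filename onto a two-level 00..ff / 00..ff directory fan-out.

    Hashing the name spreads ~500,000 files across 65,536 directories,
    so each directory only ever holds a handful of files.
    """
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    # The first two hex bytes of the hash choose the directories, e.g. "32/af".
    return Path(base_dir) / digest[0:2] / digest[2:4] / filename

# Example usage: store only the short relative path (e.g. "32/af/file-xyz.csv") in the database.
path = journey_path("/data/journeys", "file-xyz.csv")
path.parent.mkdir(parents=True, exist_ok=True)
path.write_text("lat,lon,timestamp\n...")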
Secondly, 100,000 files of 80 KB amounts to 8 GB of data, which is not big these days; it would fit on a small USB flash drive, in fact, so I think the arguments for compression are not that strong: storage is cheap. What is important, though, is backup. If you have 500,000 files you have lots of 'inodes' to traverse, and I think the statistic used to be that many backup products can only traverse 50-100 'inodes' per second, so you are going to be waiting a long time. Depending on the downtime you can tolerate, it may be better to take the system offline and back up from the raw, block device; at 100 MB/s you can back up 8 GB in 80 seconds, and I can't imagine a traditional, file-based backup can get close to that. Alternatives may be a filesystem that permits snapshots, so you can back up from a snapshot, or a mirrored filesystem that permits you to split the mirror, back up from one copy, and then rejoin the mirror.
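A back-of-envelope comparison of the two approaches, using the figures above (the traversal rate is the rough statistic quoted, not a measured number):

```python
# File-based backup: bounded by inode traversal, not data volume.
files = 500_000
inodes_per_sec = 75                      # midpoint of the 50-100 inodes/sec figure
file_backup_secs = files / inodes_per_sec
print(f"file-based walk:  ~{file_backup_secs / 3600:.1f} hours")   # ~1.9 hours

# Block-level backup: bounded by raw disk throughput.
data_gb = 100_000 * 80 / 1_000_000       # 100,000 files * 80 KB ~= 8 GB
block_backup_secs = data_gb * 1000 / 100 # at 100 MB/s
print(f"block-level copy: ~{block_backup_secs:.0f} seconds")       # ~80 seconds
```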
As I said, this is pretty subjective and I am sure others will have other ideas.
Tags: sqlite, file-storage, scalability