0433373_rickvanderzwet.txt@ 189

Last change on this file since 189 was 2, checked in by Rick van der Zwet, 15 years ago
Initial import of data of old repository ('data') worth keeping (e.g. tracking means of URL access statistics)
File size: 3.5 KB

Papers read:
- J. Gray, The Next Database Revolution. SIGMOD 2004, pp. 13-18, June 2004.
- H.-P. Kriegel, et al., Future trends in data mining. Data Mining and
  Knowledge Discovery (ISSN 1384-5810), vol. 15, no. 1, p. 87, 2007.

The discussions of current database limits [Gray, 2004] and of the current
limits in data-mining technology [Kriegel, 2007] do not focus solely on the
technological barriers of current datasets with regard to memory, latency and
storage, but also on how to process this data efficiently. [Gray, 2004] talks
about the use of interfaces to connect databases to the clients, for example
by providing direct interfaces to the clients using SOAP calls. The use of
distributed databases, however, is not mentioned. But first things first:

= Data-mining & Usability =

Data-mining nowadays focuses on subset solutions, with little attempt to
generalize the effort for mass use. The tools and methods provided and used
are merely building blocks for algorithms that target such subset solutions,
relying on well-known datasets or on a lot of sanitized and known meta-data.

With the ever-increasing amount of data gathered and stored, generating
human-understandable results (if any result at all) becomes harder and harder.
The underlying technique for generating results often cannot be explained by
logical human reasoning. This makes the results hard to justify or even
explain, leaving potentially good algorithms and strategies unused.

The (near) future should show whether we are capable of extracting results
that add value to understanding the process instead of merely showing
heuristics, allowing us to reason further about what is going on inside a
process.

= Memory based databases with a file based backend =

Reducing and eliminating latency to the database objects on specific media has
always been a major focus within the design of algorithms for database query
automation. Recent technological inventions and improvements have led to
developments allowing us to run any average small-size database fully in
memory. This reduces access to every object within the database to an equal
level, making the latency decisions in algorithms obsolete and clearing the
path for a new type of algorithm design focusing on spanning the whole
data-set as fast as possible.

Together with a full-memory database comes the process of designing the
database in such a way that it can be mirrored on persistent media for obvious
reasons (power failure, transport, backup, revisions). Instead of taking the
traditional block-level disk access approach, new disks come with the ability
to do clever queuing and latency-reducing actions on file-based objects. The
future will show whether block-based access (memory database) with file-based
storage will be one of the possibilities, and how best to cope with large
database sets.

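The idea of a memory database mirrored on persistent media can be sketched in
a few lines. The example below is only an illustration under assumptions: it
uses SQLite from the Python standard library as a stand-in, the table name
`readings`, the file name `mirror.db` and both helper functions are
hypothetical, and SQLite's page-based file is not the file-based object
storage speculated about above.

```python
import os
import sqlite3
import tempfile


def mirror_to_file(mem_db: sqlite3.Connection, path: str) -> None:
    """Copy the whole in-memory database to a file on persistent media."""
    disk_db = sqlite3.connect(path)
    try:
        # Connection.backup() is part of the standard sqlite3 module (Python 3.7+).
        mem_db.backup(disk_db)
    finally:
        disk_db.close()


def load_from_file(path: str) -> sqlite3.Connection:
    """Rebuild an in-memory database from its on-disk mirror."""
    disk_db = sqlite3.connect(path)
    mem_db = sqlite3.connect(":memory:")
    try:
        disk_db.backup(mem_db)
    finally:
        disk_db.close()
    return mem_db


# The working copy lives entirely in memory (hypothetical example data).
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL)")
mem.executemany("INSERT INTO readings (value) VALUES (?)",
                [(1.5,), (2.5,), (4.0,)])
mem.commit()

# Mirror to persistent media, then restore as after a power failure.
mirror_path = os.path.join(tempfile.mkdtemp(), "mirror.db")
mirror_to_file(mem, mirror_path)
restored = load_from_file(mirror_path)
restored_total = restored.execute("SELECT SUM(value) FROM readings").fetchone()[0]
```

All queries run against the in-memory copy; the file is only touched when the
mirror is written or read back, which is the separation the paragraph above
describes.
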
= Distributed databases =

One area not covered by [Kriegel, 2007] and [Gray, 2004] is the development of
datasets of several petabytes (like the genome databases) that need to be
accessed by many concurrent clients throughout the world, so that link-layer
latencies come into the picture.

Finding ways of making these datasets available to all clients at an
acceptable/uniform access time is something of growing importance, as datasets
are rapidly growing due to the development of new sensors and
image/video-based storage. More of those datasets also have a heavily shared
nature, as more research and business will be gathering and sharing data from
multiple (geographical) locations while still needing centralized query
interfaces.

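A centralized query interface over several locations could look roughly like
the sketch below. Everything in it is an assumption made for illustration: the
three sites are simulated with in-memory SQLite databases instead of remote
servers, and the names `make_site`, `query_site`, `query_all` and the table
`samples` are hypothetical. The one point it shows is that the same query is
sent to all sites in parallel, so the client waits for the slowest site rather
than for the sum of all round trips.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor


def make_site(rows):
    """Create one simulated site: an in-memory database holding some rows."""
    # check_same_thread=False lets a worker thread use this connection later.
    db = sqlite3.connect(":memory:", check_same_thread=False)
    db.execute("CREATE TABLE samples (name TEXT, length INTEGER)")
    db.executemany("INSERT INTO samples VALUES (?, ?)", rows)
    db.commit()
    return db


def query_site(db, min_length):
    """Run the query on a single site and return its partial result."""
    cursor = db.execute(
        "SELECT name, length FROM samples WHERE length >= ?", (min_length,))
    return cursor.fetchall()


def query_all(sites, min_length):
    """Central interface: query every site in parallel and merge the results."""
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        partials = pool.map(lambda db: query_site(db, min_length),
                            sites.values())
        merged = [row for part in partials for row in part]
    return sorted(merged)


# Hypothetical data spread over three (geographical) locations.
sites = {
    "site_a": make_site([("s1", 120), ("s2", 80)]),
    "site_b": make_site([("s3", 200)]),
    "site_c": make_site([("s4", 95), ("s5", 150)]),
}
result = query_all(sites, 100)
```

The client only sees `query_all`; where the data lives and how many sites are
involved stays hidden behind that single call.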
