Is there a general purpose data structure which combines a multi-node tree with a hashtable?
-
I've been making an add-on to a program (using .NET) which works on a key string with a description. These are grouped under categories and sub-categories, so a multi-node tree makes sense. But finding a particular node in the tree structure is rather slow, especially since the number of individual nodes can easily run to thousands or even hundreds of thousands.

Another similar program uses a database backend for this purpose. But when I tried it, the startup time was enormous (since the needed output is a flat text file). Worse, even just editing anything took long (several seconds before an edit is saved). Unfortunately it's closed source, so I can't check if it's simply doing something stupid.

At first I tried a similar approach through an Access DB with an index on the Key as well as a secondary index on the parent field. It seemed a "bit" faster, but not by much.

Anyhow, I've started turning my multi-node tree into a hybrid hashtable + tree, i.e. an in-memory equivalent of the relational DB. From preliminary tests it seems much faster than using a database. The memory use has obviously jumped, though: where the original was using 3 MB of RAM, I now see between 60 and 80 MB on a node list of 6000 items.

When I was working on this, I thought: "Why isn't there a standard library like this already?" The closest I could find is a DataTable (in RAM), but when testing it was using even more RAM and wasn't that much faster than the Access-linked DataTable.

To explain this more: the data file is a tab-delimited text file in the form:

    <Key>\t<Value>\t<Parent Key>

It might not be in order, so the child nodes can come either before or after their parent. That is the main reason I'm using a hashtable.
Sample implementation:

    using System;
    using System.Collections.Generic;
    using System.IO;

    public class KeyNote
    {
        string _key, _note, _parentKey;
        KeyNoteList _children = null;
        Dictionary<string, KeyNote> _search;   // shared lookup table owned by the container

        public KeyNote(Dictionary<string, KeyNote> search, string key, string note, string parent = "")
        {
            _search = search;
            _key = key;
            _note = note;
            _parentKey = parent;
        }

        // Builds a node from one split line: key, note, parent key (missing fields become empty).
        public KeyNote(Dictionary<string, KeyNote> search, string[] values)
            : this(search,
                   (values.Length > 0) ? values[0] : string.Empty,
                   (values.Length > 1) ? values[1] : string.Empty,
                   (values.Length > 2) ? values[2] : string.Empty)
        { }

        public bool HasChildren
        {
            get { return (_children != null) && (_children.Count > 0); }
        }

        // Created lazily so leaf nodes do not each carry an empty list.
        public KeyNoteList Children
        {
            get
            {
                if (_children == null) { _children = new KeyNoteList(); }
                return _children;
            }
        }

        public string Key
        {
            get { return _key; }
            set
            {
                if (value == _key) { return; }
                if (_search.ContainsKey(value)) { throw new ArgumentException("Duplicate key"); }
                _search.Remove(_key);
                _key = value;
                _search.Add(_key, this);
                // Keep the children pointing at the renamed node.
                if (HasChildren)
                {
                    foreach (KeyNote child in _children) { child._parentKey = value; }
                }
            }
        }

        public string Note
        {
            get { return _note; }
            set { _note = value; }
        }

        public string ParentKey
        {
            get { return _parentKey; }
            set { _parentKey = value; }
        }

        public KeyNote Parent
        {
            get
            {
                if (_parentKey == string.Empty) { return null; }
                KeyNote parent;
                if (_search.TryGetValue(_parentKey, out parent)) { return parent; }
                throw new KeyNotFoundException();
            }
            set
            {
                // Detach from the old parent, if there is one.
                KeyNote parent;
                if ((_parentKey != string.Empty) && _search.TryGetValue(_parentKey, out parent) && parent.HasChildren)
                {
                    parent._children.Remove(this);
                }
                // A null parent only clears the link; the container's root list is not updated here.
                if (value == null) { _parentKey = string.Empty; return; }
                _parentKey = value.Key;
                if (!value.Children.Contains(this)) { value.Children.Add(this); }
            }
        }

        // Writes this node followed by its whole subtree, one tab-delimited line per node.
        public void WriteToTextStream(TextWriter writer)
        {
            writer.Write(_key);
            writer.Write(KeyNoteContainer.SPLIT_CHARS[0]);
            writer.Write(_note);
            writer.Write(KeyNoteContainer.SPLIT_CHARS[0]);
            writer.WriteLine(_parentKey);
            if (HasChildren)
            {
                foreach (KeyNote child in _children) { child.WriteToTextStream(writer); }
            }
        }
    }

    public class KeyNoteList : List<KeyNote>
    {
        public KeyNoteList() : base() { }
    }

    public class KeyNoteContainer
    {
        Dictionary<string, KeyNote> _search;
        KeyNoteList _roots, _duplicates, _orphans;

        public KeyNoteContainer()
        {
            _roots = new KeyNoteList();
            _search = new Dictionary<string, KeyNote>(StringComparer.Ordinal);
            _duplicates = new KeyNoteList();
            _orphans = new KeyNoteList();
        }

        internal static char[] SPLIT_CHARS = "\t".ToCharArray();

        public void ReadFromTextStream(TextReader reader)
        {
            string line;
            // ReadLine returns null at end of stream.
            while ((line = reader.ReadLine()) != null)
            {
                string[] fields = line.Split(SPLIT_CHARS);
                KeyNote keyNote = new KeyNote(_search, fields);
                if (_search.ContainsKey(keyNote.Key))
                {
                    _duplicates.Add(keyNote);
                    continue;   // a duplicate is recorded but never indexed or linked
                }
                _search.Add(keyNote.Key, keyNote);   // index every node, roots included
                if (keyNote.ParentKey == string.Empty)
                {
                    _roots.Add(keyNote);
                }
                else
                {
                    KeyNote parent;
                    if (_search.TryGetValue(keyNote.ParentKey, out parent)) { keyNote.Parent = parent; }
                    else { _orphans.Add(keyNote); }   // parent may appear later in the file
                }
            }

            // Second pass: link nodes whose parent appeared after them.
            // A new list is built because removing from a list inside foreach throws.
            KeyNoteList unresolved = new KeyNoteList();
            foreach (KeyNote keyNote in _orphans)
            {
                KeyNote parent;
                if (_search.TryGetValue(keyNote.ParentKey, out parent)) { keyNote.Parent = parent; }
                else { unresolved.Add(keyNote); }
            }
            _orphans = unresolved;
        }

        public void WriteToTextStream(TextWriter writer)
        {
            foreach (KeyNote root in _roots) { root.WriteToTextStream(writer); }
        }
    }

Note that this implements just the bare minimum to show the idea. The trouble is that all these extras clutter up the business model. I'm now attempting this through extension methods instead, which would mean the business model class(es) would only need to implement an interface exposing the key/parentKey values.
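The interface-plus-extension-method idea mentioned above could look roughly like the sketch below. This is only one possible shape, not the poster's actual code: the names `IKeyedNode`, `TreeIndex<T>`, `BuildIndex` and `Row` are all hypothetical. The point is that the model class exposes only `Key` and `ParentKey`, while the lookup dictionary and the child lists live outside it. It assumes an empty parent key marks a root.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface: the only thing a business class has to implement.
public interface IKeyedNode
{
    string Key { get; }
    string ParentKey { get; }   // assumption: empty string means "root"
}

// Hypothetical index that keeps the hash lookup and the tree links outside the model.
public sealed class TreeIndex<T> where T : IKeyedNode
{
    public Dictionary<string, T> ByKey { get; } =
        new Dictionary<string, T>(StringComparer.Ordinal);
    public Dictionary<string, List<T>> ChildrenOf { get; } =
        new Dictionary<string, List<T>>(StringComparer.Ordinal);
    public List<T> Roots { get; } = new List<T>();
    public List<T> Orphans { get; } = new List<T>();
    public List<T> Duplicates { get; } = new List<T>();
}

public static class TreeIndexExtensions
{
    public static TreeIndex<T> BuildIndex<T>(this IEnumerable<T> items)
        where T : IKeyedNode
    {
        var index = new TreeIndex<T>();
        var accepted = new List<T>();   // keeps the input order for the second pass

        // Pass 1: register every key, so input order no longer matters.
        foreach (T item in items)
        {
            if (index.ByKey.ContainsKey(item.Key)) { index.Duplicates.Add(item); continue; }
            index.ByKey.Add(item.Key, item);
            accepted.Add(item);
        }

        // Pass 2: every parent that exists is now in ByKey, so link in one sweep.
        foreach (T item in accepted)
        {
            if (string.IsNullOrEmpty(item.ParentKey)) { index.Roots.Add(item); continue; }
            if (!index.ByKey.ContainsKey(item.ParentKey)) { index.Orphans.Add(item); continue; }

            List<T> siblings;
            if (!index.ChildrenOf.TryGetValue(item.ParentKey, out siblings))
            {
                siblings = new List<T>();
                index.ChildrenOf.Add(item.ParentKey, siblings);
            }
            siblings.Add(item);
        }
        return index;
    }
}

// Hypothetical plain model class used only to exercise the sketch.
public sealed class Row : IKeyedNode
{
    public Row(string key, string note, string parentKey)
    {
        Key = key; Note = note; ParentKey = parentKey;
    }
    public string Key { get; }
    public string Note { get; }
    public string ParentKey { get; }
}
```

With this layout a call such as `rows.BuildIndex()` works on any sequence of objects implementing the interface, and a child listed before its parent is still linked because linking only starts after all keys are registered. The trade-off is that renaming a key or re-parenting a node has to go through the index rather than the model object.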
-
Answer:
Did you try using the Dictionary data type recursively? That should be fast enough. You could also look at PouchDB.
Jonathan Jaffe at Quora
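One possible reading of the "Dictionary used recursively" suggestion above is a node type whose children are themselves stored in a dictionary. The sketch below is an interpretation, not something the answer spells out; `NestedNode`, `GetOrAdd` and `Find` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical node type: each node owns a dictionary of its children, keyed by child key.
public sealed class NestedNode
{
    public string Note { get; set; }

    public Dictionary<string, NestedNode> Children { get; } =
        new Dictionary<string, NestedNode>(StringComparer.Ordinal);

    // Returns the existing child with this key, or creates and attaches a new one.
    public NestedNode GetOrAdd(string key)
    {
        NestedNode child;
        if (!Children.TryGetValue(key, out child))
        {
            child = new NestedNode();
            Children.Add(key, child);
        }
        return child;
    }

    // Walks a path of keys down from this node; returns null if any step is missing.
    public NestedNode Find(params string[] path)
    {
        NestedNode current = this;
        foreach (string key in path)
        {
            if (!current.Children.TryGetValue(key, out current)) { return null; }
        }
        return current;
    }
}
```

Here each step down the tree is a single hash lookup, so finding a node costs one lookup per level when the full path is known. Finding a node by its key alone, without knowing its ancestors, would still need a search of the tree or a separate flat dictionary like the one in the question.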
Other answers
Subject to a pending patent in my case (not disclosable until later). I also completely disagree: I can think of *MANY* cases where you would want the equivalent of the O(1) time/space such a structure could offer (in reality O(1) space and O(log_k n) time, with k >= 256 in most cases). They do not beat B-trees in complexity, but they are much more flexible, especially in distributed computing.
Aryeh Friedman