org.omegahat.DataStructures.Data
Class DataFrame

java.lang.Object
  |
  +--java.util.Dictionary
        |
        +--java.util.Hashtable
              |
              +--org.omegahat.Environment.Utils.OrderedTable
                    |
                    +--org.omegahat.DataStructures.Data.DataFrame
All Implemented Interfaces:
Activable, AssignableSubset, BasicFrameInt, java.lang.Cloneable, Database, DataFrameInt, java.util.Map, RecordStreamListener, java.io.Serializable, ShallowCopyable, Subsettable
Direct Known Subclasses:
OrderedDataFrame

public class DataFrame
extends OrderedTable
implements java.io.Serializable, DataFrameInt, RecordStreamListener

Represents a collection of VariableInt objects as a named group, so that they can be accessed individually or in sub-groups. The variables are indexable by name, and the frame can be reduced or subset by observation index or observation name.

See Also:
Serialized Form

Inner classes inherited from class java.util.Map
java.util.Map.Entry
 
Field Summary
protected  java.util.Hashtable metaData
           
protected  boolean readHeader
           
protected  java.lang.String[] rowNames
          An ordered collection of identifiers for the different records in the dataset.
static VariableInt UnknownVariable
           
 
Fields inherited from class org.omegahat.Environment.Utils.OrderedTable
listeners, name, orderedElements, orderedKeys, state
 
Fields inherited from interface org.omegahat.Environment.Databases.Database
ALL, ASSIGN, ATTACH, DETACH, NULL_ENTRY, READ, READ_WRITE, REMOVE
 
Fields inherited from interface org.omegahat.Environment.Databases.Activable
ACTIVE, INACTIVE
 
Constructor Summary
DataFrame()
          Constructor that provides no information about the number or types of variables or the number of observations.
DataFrame(EventDataScanner scanner)
          Reads the contents of the data from the specified scanner by being notified when a complete record has been consumed by the scanner.
DataFrame(EventDataScanner scanner, boolean readHeader)
           
DataFrame(java.io.File file)
           
DataFrame(java.io.File file, boolean readHeader)
           
DataFrame(java.io.InputStream stream)
          Reads the values from the specified stream by converting it to a reader and then creating an EventDataScanner.
DataFrame(java.io.InputStream stream, boolean readHeader)
           
DataFrame(int l)
          Constructor specifying the number of records that will be in this dataset.
DataFrame(int l, Database dbase)
          Constructor specifying the number of records that will be in this dataset and a database containing variables that should be added to this dataset.
DataFrame(java.lang.Integer l)
          Constructor specifying the number of records that will be in this dataset.
DataFrame(java.io.Reader reader)
          Reads the values from the specified connection/stream.
DataFrame(java.io.Reader reader, boolean readHeader)
           
 
Method Summary
 int add(java.util.Collection col)
          Treat the entries in the collection as components of a single record and append them to the corresponding variables, using the default ordering of the variable names.
 int add(java.util.Iterator els)
          Process each element in the iterator as an element of a record, adding it to the corresponding variable in the default ordering of the variable names.
 int add(java.util.Iterator els, java.lang.String[] names)
          Process each element in the iterator by adding it to the variable identified by the corresponding element in names.
 void add(java.lang.String name)
           
 void add(java.lang.String name, VariableInt v)
           
 VariableInt addVariable(java.lang.String name, VariableInt var)
          Add the specified variable to this frame under the given name.
 VariableInt createDefaultVariable(java.lang.String name)
           
 VariableInt createDefaultVariable(java.lang.String name, int num)
           
 VariableInt createVariable(int which)
           
protected  VariableInt createVariable(java.lang.Object obj)
          This creates a VariableInt object of the appropriate class based on the type of the argument obj.
protected  int createVariables(int howMany)
          Create generic RealVariable objects and name them Var1, Var2, and so on, with the index running from 1 to howMany.
 java.lang.Object[] get(int obsNo)
           
 DataFrame get(int[] obsNos)
           
 DataFrame get(int[] obsNos, java.lang.String[] name)
           
 DataFrame get(java.lang.String[] name)
           
 boolean getReadHeader()
          Accessor for readHeader field
 VariableInt getVariable(java.lang.String name)
          Extract the specified variable identified by name.
 java.lang.String[] getVariableNames()
          Return an ordered list of the names of the variables contained in this collection.
 java.util.Hashtable metaData()
          Accessor for metaData field
 java.util.Hashtable metaData(java.util.Hashtable value)
          Accessor for setting metaData field
 boolean newRecord(java.lang.Object value, ObjectReader source)
          Notification event called by an EventDataScanner when it has read a complete record.
 long numObservations()
          Return the number of observations available in each of the variables.
 long numVariables()
          Return the number of variables contained in this frame.
 long read(EventDataScanner scanner, boolean readHeader)
          Reads the records/observations for the data frame one at a time by being notified by the scanner.
 java.lang.String rowName(java.lang.String value, int which)
           
 java.lang.String[] rowNames()
          Returns an array of the identifiers of the row names.
 java.lang.String[] rowNames(java.io.InputStream stream)
           
 java.lang.String[] rowNames(java.lang.String[] value)
           
 java.lang.String[] rowNames(java.util.Vector v)
           
 boolean setNames(java.util.Collection els)
           
 boolean setReadHeader(boolean value)
          Accessor for setting readHeader field
 BasicFrameInt subsetObservations(int[] whichRows)
          Create a new data frame consisting of the rows identified in the array passed to this method.
 BasicFrameInt subsetObservations(int start, int which)
          Get the subset of the data frame consisting of the contiguous rows starting at start and ending at which, inclusive.
 java.lang.String toString()
           
 
Methods inherited from class org.omegahat.Environment.Utils.OrderedTable
add, add, addListListener, assign, assignSubset, attach, clear, copy, copy, detach, elementAt, exists, get, getName, getState, keys, listeners, listeners, notifyListeners, objects, ordered, orderedKeys, put, put, put, remove, remove, removeElement, removeElement, removeElementAt, removeElementAt, removeListListener, setElementAt, setName, setState, subset, subset, subset
 
Methods inherited from class java.util.Hashtable
clone, contains, containsKey, containsValue, elements, entrySet, equals, get, hashCode, isEmpty, keySet, putAll, rehash, size, values
 
Methods inherited from class java.lang.Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface org.omegahat.Environment.Databases.Database
assign, attach, clear, detach, exists, get, getName, objects, remove, setName, size
 
Methods inherited from interface org.omegahat.Environment.Databases.Activable
getState, setState
 

Field Detail

UnknownVariable

public static VariableInt UnknownVariable

rowNames

protected java.lang.String[] rowNames
An ordered collection of identifiers for the different records in the dataset.

readHeader

protected boolean readHeader

metaData

protected java.util.Hashtable metaData
Constructor Detail

DataFrame

public DataFrame()
Constructor that provides no information about the number or types of variables or the number of observations.

DataFrame

public DataFrame(java.lang.Integer l)
Constructor specifying the number of records that will be in this dataset.

DataFrame

public DataFrame(int l)
Constructor specifying the number of records that will be in this dataset.

DataFrame

public DataFrame(int l,
                 Database dbase)
Constructor specifying the number of records that will be in this dataset and a database containing variables that should be added to this dataset. Each element in the database is checked to see whether it is a variable and whether its length matches the specified value.
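
The check described for this constructor can be illustrated with a small self-contained sketch. It does not use the org.omegahat classes: Column and selectColumns are hypothetical stand-ins, and silently skipping entries that fail the check is an assumption, since the description does not say how such entries are handled.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DatabaseFilterSketch {
    // Hypothetical stand-in for a variable: a named-less list of values.
    static final class Column {
        final List<?> values;
        Column(List<?> values) { this.values = values; }
        int length() { return values.size(); }
    }

    // Keep only entries that are Columns and whose length equals n.
    // Assumption: entries failing either test are skipped, not reported.
    static Map<String, Column> selectColumns(Map<String, Object> dbase, int n) {
        Map<String, Column> kept = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : dbase.entrySet()) {
            Object v = e.getValue();
            if (v instanceof Column && ((Column) v).length() == n) {
                kept.put(e.getKey(), (Column) v);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Object> dbase = new LinkedHashMap<>();
        dbase.put("height", new Column(List.of(1.7, 1.8, 1.6)));
        dbase.put("note", "not a variable");
        dbase.put("short", new Column(List.of(1.0)));
        // Only "height" is both a Column and of length 3.
        System.out.println(selectColumns(dbase, 3).keySet());
    }
}
```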

DataFrame

public DataFrame(EventDataScanner scanner)
Reads the contents of the data from the specified scanner by being notified when a complete record has been consumed by the scanner.

DataFrame

public DataFrame(EventDataScanner scanner,
                 boolean readHeader)

DataFrame

public DataFrame(java.io.Reader reader,
                 boolean readHeader)

DataFrame

public DataFrame(java.io.Reader reader)
Reads the values from the specified connection/stream.
See Also:
read(org.omegahat.Environment.Tools.EventDataScanner,boolean)

DataFrame

public DataFrame(java.io.InputStream stream)
Reads the values from the specified stream by converting it to a reader and then creating an EventDataScanner.
See Also:
read(org.omegahat.Environment.Tools.DataScanner.EventDataScanner,boolean)

DataFrame

public DataFrame(java.io.InputStream stream,
                 boolean readHeader)
See Also:
read(org.omegahat.Environment.Tools.EventDataScanner,boolean)

DataFrame

public DataFrame(java.io.File file,
                 boolean readHeader)
          throws java.io.FileNotFoundException

DataFrame

public DataFrame(java.io.File file)
          throws java.io.FileNotFoundException
Method Detail

getVariableNames

public java.lang.String[] getVariableNames()
Description copied from interface: BasicFrameInt
Return an ordered list of the names of the variables contained in this collection.
Specified by:
getVariableNames in interface BasicFrameInt

add

public void add(java.lang.String name,
                VariableInt v)
         throws java.lang.Exception

get

public java.lang.Object[] get(int obsNo)

get

public DataFrame get(java.lang.String[] name)
              throws java.lang.Exception

get

public DataFrame get(int[] obsNos)
              throws java.lang.Exception

get

public DataFrame get(int[] obsNos,
                     java.lang.String[] name)
              throws java.lang.Exception

numObservations

public long numObservations()
Description copied from interface: BasicFrameInt
Return the number of observations available in each of the variables. This assumes that the variables have the same number of records/observations.
Specified by:
numObservations in interface BasicFrameInt
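
Because the count relies on every variable having the same length, it can be useful to verify that assumption explicitly. The sketch below is a standalone illustration over a plain map of lists; ObservationCountSketch is a hypothetical helper, and throwing on a mismatch is a design choice made here, not documented behaviour of DataFrame.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ObservationCountSketch {
    // Return the common column length, or fail if the columns disagree.
    static long numObservations(Map<String, List<?>> columns) {
        long n = -1;
        for (Map.Entry<String, List<?>> col : columns.entrySet()) {
            long len = col.getValue().size();
            if (n >= 0 && n != len) {
                throw new IllegalStateException("column " + col.getKey()
                        + " has " + len + " observations, expected " + n);
            }
            n = len;
        }
        return Math.max(n, 0); // a frame with no columns has zero observations
    }

    public static void main(String[] args) {
        Map<String, List<?>> columns = new LinkedHashMap<>();
        columns.put("x", List.of(1.0, 2.0, 3.0));
        columns.put("g", List.of("a", "b", "c"));
        System.out.println(numObservations(columns));
    }
}
```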

numVariables

public long numVariables()
Description copied from interface: BasicFrameInt
Return the number of variables contained in this frame.
Specified by:
numVariables in interface BasicFrameInt

getVariable

public VariableInt getVariable(java.lang.String name)
Description copied from interface: BasicFrameInt
Extract the specified variable identified by name.
Specified by:
getVariable in interface BasicFrameInt

addVariable

public VariableInt addVariable(java.lang.String name,
                               VariableInt var)
Description copied from interface: BasicFrameInt
Add the specified variable to this frame under the given name.
Specified by:
addVariable in interface BasicFrameInt

setNames

public boolean setNames(java.util.Collection els)

toString

public java.lang.String toString()
Overrides:
toString in class java.util.Hashtable

getReadHeader

public boolean getReadHeader()
Accessor for readHeader field

setReadHeader

public boolean setReadHeader(boolean value)
Accessor for setting readHeader field

rowNames

public java.lang.String[] rowNames()
Description copied from interface: BasicFrameInt
Returns an array of the identifiers of the row names.
Specified by:
rowNames in interface BasicFrameInt

rowNames

public java.lang.String[] rowNames(java.lang.String[] value)

rowNames

public java.lang.String[] rowNames(java.util.Vector v)

metaData

public java.util.Hashtable metaData()
Accessor for metaData field
Specified by:
metaData in interface DataFrameInt

metaData

public java.util.Hashtable metaData(java.util.Hashtable value)
Accessor for setting metaData field

createVariable

protected VariableInt createVariable(java.lang.Object obj)
This creates a VariableInt object of the appropriate class based on the type of the argument obj. In this implementation, a RealVariable is created if obj is a number, and a FactorVariable if it is a String.
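
The type-based choice described here amounts to a dispatch on the runtime class of the value. A minimal standalone sketch follows; the Kind enum is a hypothetical stand-in for the RealVariable and FactorVariable classes, and rejecting other types is an assumption, since the description covers only numbers and strings.

```java
final class VariableKindSketch {
    // Hypothetical labels standing in for RealVariable and FactorVariable.
    enum Kind { REAL, FACTOR }

    // Choose a variable kind from the runtime type of a sample value.
    static Kind kindFor(Object obj) {
        if (obj instanceof Number) {
            return Kind.REAL;
        }
        if (obj instanceof String) {
            return Kind.FACTOR;
        }
        // Assumption: the documented behaviour for other types is unknown.
        throw new IllegalArgumentException("unsupported value type: " + obj);
    }

    public static void main(String[] args) {
        System.out.println(kindFor(3.14));      // a number maps to REAL
        System.out.println(kindFor("treated")); // a String maps to FACTOR
    }
}
```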

createVariables

protected int createVariables(int howMany)
Create generic RealVariable objects and name them Var1, Var2, and so on, with the index running from 1 to howMany.
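
Reading the naming rule as the prefix Var followed by the index i, the generated names can be sketched as below. DefaultNamesSketch and defaultNames are hypothetical helpers for illustration only; they produce the names and do not create any variables.

```java
import java.util.Arrays;

final class DefaultNamesSketch {
    // Build the names Var1 .. Var<howMany>, using a 1-based index.
    static String[] defaultNames(int howMany) {
        String[] names = new String[howMany];
        for (int i = 1; i <= howMany; i++) {
            names[i - 1] = "Var" + i;
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(defaultNames(3)));
    }
}
```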

createVariable

public VariableInt createVariable(int which)

createDefaultVariable

public VariableInt createDefaultVariable(java.lang.String name,
                                         int num)

createDefaultVariable

public VariableInt createDefaultVariable(java.lang.String name)

add

public void add(java.lang.String name)

rowNames

public java.lang.String[] rowNames(java.io.InputStream stream)
                            throws java.io.IOException
Specified by:
rowNames in interface BasicFrameInt

rowName

public java.lang.String rowName(java.lang.String value,
                                int which)
Specified by:
rowName in interface BasicFrameInt

subsetObservations

public BasicFrameInt subsetObservations(int[] whichRows)
Description copied from interface: DataFrameInt
Create a new data frame consisting of the rows identified in the array passed to this method.
Specified by:
subsetObservations in interface DataFrameInt

subsetObservations

public BasicFrameInt subsetObservations(int start,
                                        int which)
Description copied from interface: DataFrameInt
Get the subset of the data frame consisting of the contiguous rows starting at start and ending at which, inclusive.
Specified by:
subsetObservations in interface DataFrameInt
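
Both forms of row subsetting can be illustrated on a plain column-oriented map. The sketch below is standalone and hypothetical (SubsetSketch, subsetRows and subsetRange are not part of the API). It assumes 0-based row indices and start no greater than the end index; the documentation above states only that the range is inclusive at both ends.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class SubsetSketch {
    // Pick the given rows, in the given order, from every column.
    static Map<String, List<Object>> subsetRows(Map<String, List<Object>> frame, int[] rows) {
        Map<String, List<Object>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<Object>> col : frame.entrySet()) {
            List<Object> picked = new ArrayList<>(rows.length);
            for (int r : rows) {
                picked.add(col.getValue().get(r));
            }
            out.put(col.getKey(), picked);
        }
        return out;
    }

    // Contiguous range of rows; both start and end are included.
    static Map<String, List<Object>> subsetRange(Map<String, List<Object>> frame, int start, int end) {
        int[] rows = new int[end - start + 1];
        for (int i = 0; i < rows.length; i++) {
            rows[i] = start + i;
        }
        return subsetRows(frame, rows);
    }

    public static void main(String[] args) {
        Map<String, List<Object>> frame = new LinkedHashMap<>();
        frame.put("x", List.of(10, 20, 30, 40));
        frame.put("g", List.of("a", "b", "c", "d"));
        System.out.println(subsetRange(frame, 1, 2));
    }
}
```

Expressing the range form in terms of the index-array form keeps the inclusive-end arithmetic in one place.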

newRecord

public boolean newRecord(java.lang.Object value,
                         ObjectReader source)
Notification event called by an EventDataScanner when it has read a complete record.
Specified by:
newRecord in interface RecordStreamListener
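
The notification style used here is a listener callback: the scanner owns the read loop and calls back once per completed record. The following standalone sketch shows the pattern with hypothetical types; Listener and scan are not the org.omegahat interfaces. Treating a false return as a request to stop, and splitting records on whitespace, are assumptions, since the meaning of the boolean result and the record format are not documented here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

final class RecordListenerSketch {
    // Hypothetical callback, loosely modelled on RecordStreamListener.
    interface Listener {
        // Assumption: returning false asks the scanner to stop reading.
        boolean newRecord(String[] fields);
    }

    // Treat each non-blank line as one record and notify the listener.
    // Returns the number of records delivered.
    static long scan(Reader in, Listener listener) {
        long count = 0;
        try (BufferedReader br = new BufferedReader(in)) {
            String line;
            while ((line = br.readLine()) != null) {
                String trimmed = line.trim();
                if (trimmed.isEmpty()) {
                    continue; // skip blank lines
                }
                count++;
                if (!listener.newRecord(trimmed.split("\\s+"))) {
                    break;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        List<String[]> rows = new ArrayList<>();
        // List.add returns true, so the scan continues to the end.
        long n = scan(new StringReader("1 a\n2 b\n"), fields -> rows.add(fields));
        System.out.println(n + " records; row 2 starts with " + rows.get(1)[0]);
    }
}
```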

add

public int add(java.util.Collection col)
Treat the entries in the collection as components of a single record and append them to the corresponding variables, using the default ordering of the variable names.
See Also:
getVariableNames(), add(java.util.Iterator,String[])

add

public int add(java.util.Iterator els)
Process each element in the iterator as an element of a record, adding it to the corresponding variable in the default ordering of the variable names.
See Also:
getVariableNames(), add(java.util.Iterator,String[])

add

public int add(java.util.Iterator els,
               java.lang.String[] names)
Process each element in the iterator by adding it to the variable identified by the corresponding element in names.
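
The record-appending behaviour of the add methods can be sketched on a plain map of columns: the i-th element of the record goes to the column named by the i-th name, and the default form uses the frame's own column order. Everything below is a hypothetical illustration (AppendRecordSketch is not part of the API). Returning the number of values appended and creating a missing column on demand are both assumptions; the meaning of the int result is not documented above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class AppendRecordSketch {
    // Append one record: element i goes to the column named names[i].
    // Assumption: the result is the number of values appended.
    static int add(Map<String, List<Object>> frame, Iterator<?> els, String[] names) {
        int i = 0;
        while (els.hasNext() && i < names.length) {
            // Assumption: a column that does not exist yet is created.
            frame.computeIfAbsent(names[i], k -> new ArrayList<>()).add(els.next());
            i++;
        }
        return i;
    }

    // Default ordering: use the frame's column names in insertion order.
    static int add(Map<String, List<Object>> frame, Iterator<?> els) {
        return add(frame, els, frame.keySet().toArray(new String[0]));
    }

    public static void main(String[] args) {
        Map<String, List<Object>> frame = new LinkedHashMap<>();
        frame.put("dose", new ArrayList<>());
        frame.put("group", new ArrayList<>());
        // First record follows the default column order.
        add(frame, Arrays.asList(0.5, "control").iterator());
        // Second record supplies its own name order.
        add(frame, Arrays.asList("treated", 1.5).iterator(), new String[] {"group", "dose"});
        System.out.println(frame);
    }
}
```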

read

public long read(EventDataScanner scanner,
                 boolean readHeader)
Reads the records/observations for the data frame one at a time by being notified by the scanner. Keeping this method separate from the constructors allows one to populate the data frame with variables before reading the actual record values.