chkcsv: Commit

Default repository for chkcsv.py.


Commit MetaInfo

Revision: d4faed2f56388eed62a9e7876c968c611147a9e1 (tree)
Time: 2018-07-29 08:17:46
Author: Dreas Nielsen <dreas.nielsen@gmai...>
Committer: Dreas Nielsen

Log Message

First commit

Change Summary

Diff

diff -r 000000000000 -r d4faed2f5638 .hgignore
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/.hgignore Sat Jul 28 16:17:46 2018 -0700
@@ -0,0 +1,7 @@
+syntax=glob
+MANIFEST
+chkcsv.htm
+.pypirc
+dist/*
+doc/build/*
+test/*
diff -r 000000000000 -r d4faed2f5638 LICENSE.txt
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/LICENSE.txt Sat Jul 28 16:17:46 2018 -0700
@@ -0,0 +1,11 @@
+chkcsv.py
+Copyright (c) 2011, R.Dreas Nielsen
+
+This program is free software: you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free Software
+Foundation, either version 3 of the License, or (at your option) any later
+version. This program is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more
+details. The GNU General Public License is available at
+http://www.gnu.org/licenses/.
diff -r 000000000000 -r d4faed2f5638 MANIFEST.in
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/MANIFEST.in Sat Jul 28 16:17:46 2018 -0700
@@ -0,0 +1,2 @@
+include README.txt
+include LICENSE.txt
diff -r 000000000000 -r d4faed2f5638 README.txt
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/README.txt Sat Jul 28 16:17:46 2018 -0700
@@ -0,0 +1,196 @@
+chkcsv.py
+
+chkcsv.py is a Python module and program that checks the format and content
+of a comma-separated-value (CSV) or similar delimited text file. It can check
+whether required columns are present, and the type, length, and pattern of
+each column.
+
+
+Syntax and Options
+===================
+
+    chkcsv.py [options] <CSV file name>
+
+Arguments
+----------
+    <CSV file name>   The name of the CSV file to check.
+
+Options
+-------
+    --version             Show program's version number and exit
+    -h, --help            Show this help message and exit
+    -s, --showspecs
+                          Show the format specifications allowed in the
+                          configuration file, and exit.
+    -f FORMATSPEC, --formatspec=FORMATSPEC
+                          Name of the file with the format specification. The default
+                          is the name of the CSV file with an extension of fmt.
+    -r, --required
+                          A data value is required in data columns for which the format
+                          specification does not include an explicit specification of
+                          whether data is required for a column. The default is false
+                          (i.e., data are not required).
+    -q, --columnsnotrequired
+                          Columns listed in the format configuration file are not
+                          required to be present unless the column_required
+                          specification is explicitly set in the configuration file.
+                          The default is true (i.e., all columns in the configuration
+                          file are required in the CSV file).
+    -c, --columnexit
+                          Exit immediately if there are more columns in the CSV file
+                          header than are specified in the format configuration file.
+    -l, --linelength
+                          Allow rows of the CSV file to have fewer columns than in the
+                          column headers. The default is to report an error for short
+                          data rows. If short data rows are allowed, any row without
+                          enough columns to match the format specification will still
+                          be reported as an error.
+    -i, --case-insensitive
+                          Case-insensitive matching of column names in the format
+                          configuration file and the CSV file. The default is
+                          case-sensitive (i.e., column names must match exactly).
+    -e ENCODING, --encoding=ENCODING
+                          Character encoding of the CSV file. It should be one of the
+                          strings listed at
+                          http://docs.python.org/library/codecs.html#standard-encodings.
+    -o OPTSECTION, --optsection=OPTSECTION
+                          An alternate name for the chkcsv options section in the
+                          format specification configuration file.
+    -x, --exitonerror
+                          Exit when the first error is found.
+
+
+Format Specifications
+=====================
+
+The format of each of the columns of the CSV file is specified in a separate
+configuration file containing a section for each column. Each section begins
+with the column name in square brackets, followed by key-value pairs
+identifying the specifications for that column. Each key-value pair consists
+of a keyword and an associated value. Keywords and values should be separated
+by either "=" or ":". Each keyword should be at the beginning of a line.
+
+By default, the configuration file has the same name as the CSV file, but
+with an extension of ".fmt". An alternate configuration file can be specified
+with the "-f" command-line option.
+
+The keywords that can be used for column format specifications are listed below.
+A specific type of value should be provided for each keyword. Boolean values
+are indicated by "Yes", "No", "True", "False", "On", "Off", "1", or "0". Format
+specification keywords and values should not be quoted in the configuration
+file. The allowable keywords are:
+
+column_required
+    Indicates whether or not the column must be present in the CSV file.
+    This is a Boolean value. The default value is True, and can be changed
+    with the "-q" command-line option. This format option need be included
+    in the format configuration file only when the default is to be overridden.
+
+data_required
+    Indicates whether or not a value is required in this column on every row
+    of the CSV file. This is a Boolean value. The default value is False,
+    and can be changed with the "-r" command-line option. This format option
+    need be included in the format configuration file only when the default
+    is to be overridden.
+
+type
+    Identifies the type of data in the data column. Valid values are "string",
+    "integer", "float", "bool", "date", and "datetime". Data values in the
+    CSV file will be checked for compatibility with the specified type. If
+    the data type is not specified, data values will be treated as strings;
+    that is, minimum and maximum lengths and the pattern will be checked
+    if they have been specified.
+
+minlen
+    The required minimum length of data values for this column. This is
+    only checked for string data types and for data with no type specified.
+
+maxlen
+    The maximum allowed length of data values for this column. This is only
+    checked for string data types and for data with no type specified.
+
+pattern
+    A regular expression specifying the content of the column value.
+    Patterns must match at the beginning of the column value. This is
+    checked for string, date, and datetime data types, and for data with
+    no type specified.
+
+
+Usage Notes
+===========
+
+  * The first line of the CSV file must contain the names of the columns.
+
+  * The order of column specifications in the configuration file does not
+    have to match the order of columns in the CSV file.
+
+  * Format specification keywords for a column may be in any order within
+    the column section in the configuration file.
+
+  * Column names in the CSV file and in the configuration file are
+    case-sensitive, and must match exactly by default. If column names
+    in the configuration file and the CSV file don't match because the
+    case is different, an error will be reported only if the unmatched
+    column is required. The "-i" command-line option can be used to allow
+    case-insensitive matching of column names.
+
+  * The pattern that a column should match is specified by a regular
+    expression. The regular expression syntax supported by chkcsv.py is
+    as documented at http://docs.python.org/library/re.html.
+
+  * Patterns (regular expressions) must match at the beginning of the
+    column value. To ensure that the regular expression matches the
+    entire column value, you may need to include "$" at the end of the
+    regular expression.
+
+  * By default, all columns listed in the configuration file are
+    considered to be required, and if the column name is not present
+    in the CSV file (header row), this will be considered to be an error
+    and chkcsv.py will halt immediately. The default behavior can be
+    changed with the "-q" command-line option. If "-q" is used, or the
+    "column_required" format specification is set to False, and the
+    column is not present, no error will occur. If the column is present,
+    any other format specifications will be applied. That is, even if a
+    column is not required, if it is present and its data fails some other
+    test, an error will be reported.
+
+  * chkcsv.py recognizes a wide variety of date and datetime formats.
+    It may actually recognize a date or datetime format that the target
+    software (e.g., a DBMS) does not. In this case, specifying a pattern
+    for the date column can usefully restrict the types of date and
+    datetime values that are accepted.
+
+  * The CSV file is expected to have the same number of data items on
+    each row as there are column names in the first row of the file.
+    If the "-l" command-line option is used, the CSV file may have
+    varying numbers of data values in each row, as long as each row
+    has enough values to correspond to each data column that will be
+    checked. That is, if "-l" is used, and there are columns to the
+    right of all of the required columns, those data items may or may
+    not be present in any row without causing an error. However, if a
+    row is short because a value in a required column is missing, and
+    this omission does not cause any violation of any format
+    specification, this error will not necessarily be recognized.
+
+  * chkcsv.py does not transform the input file in any way. It does
+    not produce any output file or send any output to stdout except
+    for help and version messages. chkcsv.py only writes error messages,
+    if any, to stderr and sets the exit status value when it terminates.
+
+  * chkcsv.py is intended to verify that a data file is suitable for
+    import to database, statistical, graphics, modeling, or other
+    software. The checks that it can perform are generally sufficient
+    to determine whether each data column is compatible with typical
+    specifications for a database column. However, chkcsv.py does not
+    do any row-level checks to verify that column values within a row
+    are consistent with each other. Nor does it do any dataset-level
+    checks to ensure, for example, that each row is unique.
+
+  * chkcsv.py includes a provision to allow additional options to be
+    specified in a special section of the configuration file. By default,
+    the name of this special section is "chkcsvoptions". A different
+    name for this special section can be specified with the "-o"
+    command-line argument. Currently there are no special options
+    supported, and if this special section is present in the
+    configuration file, it will be ignored.
+
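Reviewer note (not part of the commit): the README describes an INI-style format file and says that patterns match at the beginning of a value. The sketch below illustrates both points under stated assumptions. It is written for Python 3 and uses the standard `configparser` and `re` modules, whereas the committed program targets Python 2 and `ConfigParser`. The column names `id` and `sample_date` and the pattern are hypothetical, chosen only for illustration.

```python
import configparser
import re

# Hypothetical format specification; in practice this text would live in a
# ".fmt" file next to the CSV file. Column names are illustrative only.
FMT_TEXT = r"""
[id]
type = integer
data_required = Yes

[sample_date]
type = date
column_required = No
pattern = \d{4}-\d{2}-\d{2}$
"""

# interpolation=None keeps regular expressions from being treated as
# configparser interpolation syntax.
specs = configparser.ConfigParser(interpolation=None)
specs.read_string(FMT_TEXT)

# Boolean keywords accept Yes/No, True/False, On/Off, 1/0.
id_required = specs.getboolean("id", "data_required")
date_col_required = specs.getboolean("sample_date", "column_required")

# re.match() anchors only at the start of the value, so the trailing "$"
# in the pattern is what rejects extra characters at the end.
date_rx = re.compile(specs.get("sample_date", "pattern"))

def matches(value):
    """Return True if the value satisfies the sample_date pattern."""
    return date_rx.match(value) is not None

# Without "$", a pattern accepts any value that merely starts correctly.
prefix_only = re.compile(r"\d{4}")
```

With these definitions, `matches("2018-07-29")` succeeds while `matches("2018-07-29x")` fails, and `prefix_only` accepts `"2018-07-29"` because only the leading four digits are tested.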
diff -r 000000000000 -r d4faed2f5638 chkcsv/chkcsv.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/chkcsv/chkcsv.py Sat Jul 28 16:17:46 2018 -0700
@@ -0,0 +1,518 @@
+#! /usr/bin/python
+# chkcsv.py
+#
+# PURPOSE:
+#    Check the contents of a CSV file, specifically that columns match a
+#    specified format.
+#
+# NOTES:
+#    1. Column format specifications are stored in a configuration file
+#       with an INI-file format, where bracketed sections correspond to
+#       columns and each section contains key-value pairs of format specifications.
+#    2. Recognized column specifications are:
+#           column_required=1|Yes|True|On|0|No|False|Off
+#           data_required=1|Yes|True|On|0|No|False|Off
+#           minlen=<integer>
+#           maxlen=<integer>
+#           type=integer|float|string|date|datetime|bool
+#           pattern=<regular expression identifying valid values>
+#    3. Global options in the format specification file are not yet implemented,
+#       though a section name for them is reserved.
+#
+# COPYRIGHT:
+#    Copyright (c) 2011, R.Dreas Nielsen (RDN)
+#
+# LICENSE:
+#    GPL v.3
+#    This program is free software: you can redistribute it and/or
+#    modify it under the terms of the GNU General Public License as published
+#    by the Free Software Foundation, either version 3 of the License, or
+#    (at your option) any later version. This program is distributed in the
+#    hope that it will be useful, but WITHOUT ANY WARRANTY; without even
+#    the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR
+#    PURPOSE. See the GNU General Public License for more details. The GNU
+#    General Public License is available at http://www.gnu.org/licenses/.
+#
+# HISTORY:
+#    Date          Remarks
+#    ----------    --------------------------------------------------------------
+#    2011-09-25    First version. Version 0.8.0.0. RDN.
+# ============================================================================
+
+_version = "0.8.0.0"
+_vdate = "2011-09-24"
+
+import sys
+from optparse import OptionParser
+import ConfigParser
+import codecs
+import os.path
+import csv
+import re
+import datetime
+import types
+import traceback
+
+FORMATSPECS = """Format specification options:
+    column_required=1|Yes|True|On|0|No|False|Off
+    type=integer|float|string|date|datetime|bool
+    data_required=1|Yes|True|On|0|No|False|Off
+    minlen=<integer>
+    maxlen=<integer>
+    pattern=<regular expression identifying valid values>
+"""
+
+class ChkCsvError(Exception):
+    """Base class for chkcsv errors."""
+    def __init__(self, errmsg, infile=None, line=None, column=None):
+        self.errmsg = errmsg
+        self.infile = infile
+        self.line = line
+        self.column = column
+
+class CsvChecker():
+    """Object to check a specific column of a defined type. After initialization, the 'check()'
+    method will return a boolean indicating whether a data value is acceptable."""
+    get_fn = {
+        'column_required' : ConfigParser.SafeConfigParser.getboolean,
+        'data_required' : ConfigParser.SafeConfigParser.getboolean,
+        'type' : ConfigParser.SafeConfigParser.get,
+        'minlen' : ConfigParser.SafeConfigParser.getint,
+        'maxlen' : ConfigParser.SafeConfigParser.getint,
+        'pattern' : ConfigParser.SafeConfigParser.get
+        }
+    datetime_fmts = ("%x",
+        "%c",
+        "%x %X",
+        "%m/%d/%Y",
+        "%m/%d/%y",
+        "%m/%d/%Y %H%M",
+        "%m/%d/%Y %I:%M %p",
+        "%m/%d/%y %H%M",
+        "%m/%d/%y %I:%M %p",
+        "%Y-%m-%d %H%M",
+        "%Y-%m-%d %I:%M %p",
+        "%Y-%m-%d",
+        "%Y/%m/%d %H%M",
+        "%Y/%m/%d %I:%M %p",
+        "%Y/%m/%d %X",
+        "%Y/%m/%d",
+        "%b %d, %Y",
+        "%b %d, %Y %X",
+        "%b %d, %Y %I:%M %p",
+        "%b %d %Y",
+        "%b %d %Y %X",
+        "%b %d %Y %I:%M %p",
+        "%d %b, %Y",
+        "%d %b, %Y %X",
+        "%d %b, %Y %I:%M %p",
+        "%d %b %Y",
+        "%d %b %Y %X",
+        "%d %b %Y %I:%M %p",
+        "%b. %d, %Y",
+        "%b. %d, %Y %X",
+        "%b. %d, %Y %I:%M %p",
+        "%b. %d %Y",
+        "%b. %d %Y %X",
+        "%b. %d %Y %I:%M %p",
+        "%d %b., %Y",
+        "%d %b., %Y %X",
+        "%d %b., %Y %I:%M %p",
+        "%d %b. %Y",
+        "%d %b. %Y %X",
+        "%d %b. %Y %I:%M %p",
+        "%Y",
+        "%b %Y",
+        "%b, %Y",
+        "%b. %Y",
+        "%b., %Y",
+        "%b-%Y",
+        "%b.-%Y",
+        "%B %d, %Y",
+        "%B %d, %Y %X",
+        "%B %d, %Y %I:%M %p",
+        "%B %d %Y",
+        "%B %d %Y %X",
+        "%B %d %Y %I:%M %p",
+        "%d %B, %Y",
+        "%d %B, %Y %X",
+        "%d %B, %Y %I:%M %p",
+        "%d %B %Y",
+        "%d %B %Y %X",
+        "%d %B %Y %I:%M %p",
+        "%B %Y",
+        "%B, %Y",
+        "%B-%Y",
+        )
+    date_fmts = ("%x",
+        "%c",
+        "%x %X",
+        "%m/%d/%Y",
+        "%m/%d/%y",
+        "%Y-%m-%d",
+        "%Y/%m/%d",
+        "%b %d, %Y",
+        "%b %d %Y",
+        "%d %b, %Y",
+        "%d %b %Y",
+        "%b. %d, %Y",
+        "%b. %d %Y",
+        "%d %b., %Y",
+        "%d %b. %Y",
+        "%Y",
+        "%b %Y",
+        "%b, %Y",
+        "%b. %Y",
+        "%b., %Y",
+        "%b-%Y",
+        "%b.-%Y",
+        "%B %d, %Y",
+        "%B %d %Y",
+        "%d %B, %Y",
+        "%d %B %Y",
+        "%B %Y",
+        "%B, %Y",
+        "%B-%Y",
+        )
+    # Basic format checking functions. These return None if the data are acceptable,
+    # a textual description of the problem otherwise.
+    def chk_req(self, data):
+        return "missing data" if len(data)==0 else None
+    def chk_min(self, data):
+        return None if (not self.data_required and len(data)==0) or \
+            len(data) >= self.minlen else "data too short"
+    def chk_max(self, data):
+        return None if len(data) <= self.maxlen else "data too long"
+    def chk_pat(self, data):
+        return None if len(data)==0 or self.rx.match(data) else "pattern mismatch"
+    def chk_int(self, data):
+        if len(data)==0:
+            return None
+        try:
+            x = int(data)
+            return None
+        except ValueError:
+            return "not an integer"
+    def chk_float(self, data):
+        if len(data)==0:
+            return None
+        try:
+            x = float(data)
+            return None
+        except ValueError:
+            return "not a floating-point number"
+    def chk_bool(self, data):
+        if len(data)==0:
+            return None
+        return None if data in ('True', 'true', 'TRUE', 'T', 't', 'Yes', 'yes', 'YES', 'Y', 'y',
+            'False', 'false', 'FALSE', 'F', 'f',
+            'No', 'no', 'NO', 'N', 'n', True, False) else "unrecognized boolean"
+    def chk_datetime(self, data):
+        if len(data)==0:
+            return None
+        if type(data) == type(datetime.datetime.now()):
+            return None
+        if type(data) == type(datetime.date.today()):
+            return None
+        if type(data) != types.StringType:
+            if data==None:
+                return "missing date/time"
+            try:
+                data = str(data)
+            except ValueError:
+                return "can't convert data to string for date/time test"
+        for f in self.datetime_fmts:
+            try:
+                dt = datetime.datetime.strptime(data, f)
+            except:
+                continue
+            break
+        else:
+            return "invalid date/time"
+        return None
+    def chk_date(self, data):
+        if len(data)==0:
+            return None
+        if type(data) == type(datetime.date.today()):
+            return None
+        if type(data) != types.StringType:
+            if data==None:
+                return "missing date"
+            try:
+                data = str(data)
+            except ValueError:
+                return "can't convert data to string for date test"
+        for f in self.date_fmts:
+            try:
+                dt = datetime.datetime.strptime(data, f)
+            except:
+                continue
+            break
+        else:
+            return "invalid date"
+        return None
+    def dispatch(self, check_funcs, data):
+        errlist = [ f(data) for f in check_funcs ]
+        return [ e for e in errlist if e ]
+    def __init__(self, fmt_spec, colname, column_required_default, data_required_default):
+        self.name = colname
+        self.data_required = data_required_default
+        # By default, all columns are required unless there is a specification indicating that it is not.
+        self.column_required = column_required_default
+        specs = fmt_spec.options(colname)
+        # Get the value for each option, using an appropriate function for each expected value type.
+        for spec in specs:
+            try:
+                specval = self.get_fn[spec](fmt_spec, colname, spec)
+            except KeyError:
+                raise ChkCsvError('Unrecognized format specification (%s)' % spec, column=colname)
+            setattr(self, spec, specval)
+        # Convert any pattern attribute to an rx attribute
+        if hasattr(self, 'pattern'):
+            try:
+                self.rx = re.compile(self.pattern)
+            except:
+                raise ChkCsvError("Invalid regular expression pattern: %s" % self.pattern, column=colname)
+        # Create the check method
+        errfuncs = []
+        if self.data_required:
+            errfuncs.append(self.chk_req)
+        if hasattr(self, 'type'):
+            if self.type == 'string':
+                if hasattr(self, 'minlen'):
+                    errfuncs.append(self.chk_min)
+                if hasattr(self, 'maxlen'):
+                    errfuncs.append(self.chk_max)
+                if hasattr(self, 'pattern'):
+                    errfuncs.append(self.chk_pat)
+            elif self.type == 'integer':
+                errfuncs.append(self.chk_int)
+            elif self.type == 'float':
+                errfuncs.append(self.chk_float)
+            elif self.type == 'date':
+                errfuncs.append(self.chk_date)
+                if hasattr(self, 'pattern'):
+                    errfuncs.append(self.chk_pat)
+            elif self.type == 'datetime':
+                errfuncs.append(self.chk_datetime)
+                if hasattr(self, 'pattern'):
+                    errfuncs.append(self.chk_pat)
+        else:
+            if hasattr(self, 'minlen'):
+                errfuncs.append(self.chk_min)
+            if hasattr(self, 'maxlen'):
+                errfuncs.append(self.chk_max)
+            if hasattr(self, 'pattern'):
+                errfuncs.append(self.chk_pat)
+        self.check = lambda data: self.dispatch(errfuncs, data)
+
+
+def clparser():
+    usage_msg = """Usage: %prog [options] <CSV file name>
+Arguments:
+  CSV file name   The name of a comma-separated-values file to check."""
+    vers_msg = "%prog " + "%s %s" % (_version, _vdate)
+    desc_msg = "Checks the content and format of a CSV file."
+    parser = OptionParser(usage=usage_msg, version=vers_msg, description=desc_msg)
+    parser.add_option("-s", "--showspecs", action="store_true", dest="showspecs",
+                        default=False,
+                        help="Show the format specifications allowed in the configuration file, and exit.")
+    parser.add_option("-f", "--formatspec",
+                        action="store", dest="formatspec",
+                        type="string",
+                        help="Name of the file with the format specification. The default is the name of the CSV file with an extension of fmt.")
+    parser.add_option("-r", "--required", action="store_true", dest="data_required",
+                        default=False,
+                        help="A data value is required in data columns for which the format specification does not include an explicit specification of whether data is required for a column. The default is false (i.e., data are not required).")
+    parser.add_option("-q", "--columnsnotrequired", action="store_false", dest="column_required",
+                        default=True,
+                        help="Columns listed in the format configuration file are not required to be present unless the column_required specification is explicitly set in the configuration file. The default is true (i.e., all columns in the configuration file are required in the CSV file).")
+    parser.add_option("-c", "--columnexit", action="store_true", dest="columnexit",
+                        default=False,
+                        help="Exit immediately if there are more columns in the CSV file header than are specified in the format configuration file.")
+    parser.add_option("-l", "--linelength", action="store_false", dest="linelength",
+                        default=True,
+                        help="Allow rows of the CSV file to have fewer columns than in the column headers. The default is to report an error for short data rows. If short data rows are allowed, any row without enough columns to match the format specification will still be reported as an error.")
+    parser.add_option("-i", "--case-insensitive", action="store_true", dest="caseinsensitive",
+                        default=False,
+                        help="Case-insensitive matching of column names in the format configuration file and the CSV file. The default is case-sensitive (i.e., column names must match exactly).")
+    parser.add_option("-e", "--encoding", action="store", type="string", dest="encoding",
+                        default=None,
+                        help="Character encoding of the CSV file. It should be one of the strings listed at http://docs.python.org/library/codecs.html#standard-encodings.")
+    parser.add_option("-o", "--optsection", action="store", dest="optsection",
+                        type="string",
+                        help="An alternate name for the chkcsv options section in the format specification configuration file.")
+    parser.add_option("-x", "--exitonerror",
+                        action="store_true", dest="haltonerror",
+                        default=False,
+                        help="Exit when the first error is found.")
+    return parser
+
+class UTF8Recoder:
+    """Iterator that reads an encoded stream and reencodes the input to UTF-8."""
+    def __init__(self, f, encoding):
+        self.reader = codecs.getreader(encoding)(f)
+    def __iter__(self):
+        return self
+    def next(self):
+        return self.reader.next().encode('utf-8')
+
+class UnicodeReader:
+    """A CSV reader which will iterate over lines in the CSV file "f",
+    which is encoded in the given encoding."""
+    def __init__(self, f, dialect=csv.excel, encoding="utf-8", **kwds):
+        f = UTF8Recoder(f, encoding)
+        self.reader = csv.reader(f, dialect=dialect, **kwds)
+    def next(self):
+        row = self.reader.next()
+        return [unicode(s, "utf-8") for s in row]
+    def __iter__(self):
+        return self
+
+def show_errors(errlist):
+    """Items in errlist are a tuple of a narrative message, the name of the file
+    in which the error occurred, the line number of the file, and the column
+    name of the file. All but the first may be null."""
+    for err in errlist:
+        sys.stderr.write("%s.\n" % " ".join([ "%s %s" % em for em in [ e for e in
+            zip(("Error:", "in file", "on line", "in column"), err) if e[1]]]))
+
+
+def check_csv_file(csv_fname, cols, halt_on_err, columnexit, \
+        linelength, caseinsensitive, encoding=None):
+    """Check that all of the required columns and data are present in the CSV file, and that
+    the data conform to the appropriate type and other specification.
+    Arguments are: 1. The name of the CSV file to check; 2. A dictionary of
+    specifications (CsvChecker objects) indexed by column name; 3. Whether to exit
+    on the first error; 4. Whether to exit if the CSV file doesn't have
+    exactly the same columns in the format specifications; 5. Whether to
+    report an error if any data row has a different number of items than indicated
+    by the column headers; 6. Whether column names in the specifications and
+    CSV file should be compared case-insensitive; 7. The character encoding of
+    the CSV file.
+    """
+    dialect = csv.Sniffer().sniff(open(csv_fname, "rt").readline())
+    if encoding:
+        inf = UnicodeReader(open(csv_fname, "rt"), dialect, encoding)
+    else:
+        inf = csv.reader(open(csv_fname, "rt"), dialect)
+    colnames = inf.next()
+    req_cols = [ c for c in cols if cols[c].column_required ]
+    # Exit if all required columns are not present
+    if caseinsensitive:
+        colnames_l = [ c.lower() for c in colnames ]
+        req_missing = [ col for col in req_cols if not (col.lower() in colnames_l) ]
+    else:
+        req_missing = [ col for col in req_cols if not (col in colnames) ]
+    if len(req_missing) > 0:
+        raise ChkCsvError("The following columns are required, but are not present in the CSV file: %s." % ", ".join(req_missing), csv_fname, 1)
+    # Exit if there are extra columns and the option to exit is set.
+    if columnexit:
+        if caseinsensitive:
+            speccols_l = [ c.lower() for c in cols ]
+            extra = [ col for col in colnames if not (col.lower() in speccols_l) ]
+        else:
+            extra = [ col for col in colnames if not (col in cols) ]
+        if len(extra) > 0:
+            raise ChkCsvError("The following columns have no format specifications but are in the CSV file: %s." % ", ".join(extra), csv_fname, 1)
+    # Column names common to specifications and data file. These will be used
+    # to index the cols dictionary to get the appropriate check method
+    # and to index the CSV column name list (colnames) to get the column position.
+    if caseinsensitive:
+        chkcols = {}
+        for x in cols:
+            for y in colnames:
+                if x.lower() == y.lower():
+                    chkcols[x] = y
+    else:
+        datacols = [ col for col in cols if col in colnames ]
+        chkcols = dict(zip(datacols, datacols))
+    # Get maximum required column number (index) to check data rows
+    dataindex = [ colnames.index(chkcols[col]) for col in chkcols ]
+    maxindex = max(dataindex) if len(dataindex) > 0 else 0   # 0 if format file is empty
+    colloc = dict(zip([ chkcols[c] for c in chkcols ], dataindex))
+    # Read and check the CSV file until done (or until an error).
+    errorlist = []
+    row_no = 1   # Header is row 1.
+    for datarow in inf:
+        row_no += 1
+        if (len(datarow) > 0) and (len(datarow) < len(colnames)) and linelength:
+            errorlist.append(("fewer data values than column headers", csv_fname, row_no))
+        if len(datarow) < maxindex + 1:
+            if len(datarow) > 0:
+                errorlist.append(("fewer data values than columns in the format specification", csv_fname, row_no))
+                if halt_on_err:
+                    return errorlist
+        else:
+            for col in chkcols:
+                col_errs = cols[col].check(datarow[colloc[chkcols[col]]])
+                if len(col_errs) > 0:
+                    errorlist.extend([ (e, csv_fname, row_no, cols[col].name) for e in col_errs ])
+                    if halt_on_err:
+                        return errorlist
+    return errorlist
+
+
+def main():
+    parser = clparser()
+    (opts, args) = parser.parse_args()
+    if opts.showspecs:
+        print(FORMATSPECS)
+        return 0
+    if len(args)==0:
+        parser.print_help()
+        return 0
+    if len(args) <> 1:
+        raise ChkCsvError("A single argument, the name of the CSV file to check, must be provided.")
+    csv_file = args[0]
+    if not os.path.exists(csv_file):
+        raise ChkCsvError("The specified CSV file does not exist.", csv_file)
+    if opts.formatspec:
+        fmt_file = opts.formatspec
+    else:
+        (fn, ext) = os.path.splitext(csv_file)
+        fmt_file = "%s.fmt" % fn
+    if not os.path.exists(fmt_file):
+        raise ChkCsvError("The format file does not exist.", fmt_file)
+    fmtspecs = ConfigParser.SafeConfigParser()
+    try:
+        files_read = fmtspecs.read([fmt_file])
+    except ConfigParser.Error:
+        raise ChkCsvError("Error reading format specification file.", fmt_file)
+    if len(files_read) == 0:
+        raise ChkCsvError("Error reading format specification file.", fmt_file)
+    if opts.optsection:
+        chkopts = opts.optsection
+    else:
+        chkopts = "chkcsvoptions"
+    # Convert ConfigParser object into a list of CsvChecker objects
+    speccols = [ sect for sect in fmtspecs.sections() if sect <> chkopts ]
+    cols = {}
+    for col in speccols:
+        cols[col] = CsvChecker(fmtspecs, col, opts.column_required, opts.data_required)
+    # Check the file
+    errorlist = check_csv_file(csv_file, cols, opts.haltonerror,
+        opts.columnexit, opts.linelength, opts.caseinsensitive, opts.encoding)
+    if len(errorlist) > 0:
+        show_errors(errorlist)
+        return 1
+    else:
+        return 0
+
+
+if __name__=='__main__':
+    try:
+        status = main()
+    except ChkCsvError, msg:
+        show_errors( [ (msg.errmsg, msg.infile, msg.line, msg.column) ] )
+        exit(1)
+    except SystemExit, x:
+        sys.exit(x)
+    except Exception:
+        strace = traceback.extract_tb(sys.exc_info()[2])[-1:]
+        lno = strace[0][1]
+        src = strace[0][3]
+        sys.stderr.write("%s: Uncaught exception %s (%s) on line %s (%s)." % (os.path.basename(sys.argv[0]), str(sys.exc_info()[0]), sys.exc_info()[1], lno, src))
+        sys.exit(1)
+    sys.exit(status)
+
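Reviewer note (not part of the commit): each column check in chkcsv.py returns None when a value is acceptable and a short message otherwise, and `dispatch()` collects the messages that are not None. The sketch below restates that idea as standalone Python 3 functions so it can be run on its own. The short `DATE_FMTS` tuple is a hypothetical subset of the committed `date_fmts` list, and the free functions stand in for the `CsvChecker` methods; this is an illustration, not the committed implementation.

```python
import datetime

# Hypothetical subset of the date formats tried by the committed code.
DATE_FMTS = ("%m/%d/%Y", "%Y-%m-%d", "%Y/%m/%d")

def chk_req(data):
    """Report empty values when data are required."""
    return "missing data" if len(data) == 0 else None

def chk_date(data, fmts=DATE_FMTS):
    """Accept a value if any known format parses it; empty values pass."""
    if len(data) == 0:
        return None
    for fmt in fmts:
        try:
            datetime.datetime.strptime(data, fmt)
        except ValueError:
            continue
        return None  # first format that parses is enough
    return "invalid date"

def dispatch(check_funcs, data):
    """Run every check and keep only the error messages."""
    return [e for e in (f(data) for f in check_funcs) if e]

checks = [chk_req, chk_date]
ok_errors = dispatch(checks, "2018-07-29")
empty_errors = dispatch(checks, "")
bad_errors = dispatch(checks, "2018-13-01")
```

Because an empty value passes `chk_date`, only `chk_req` reports it, which mirrors how the committed type checks leave missing data to the data_required check.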
diff -r 000000000000 -r d4faed2f5638 doc/Makefile
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/Makefile Sat Jul 28 16:17:46 2018 -0700
@@ -0,0 +1,20 @@
+# Minimal makefile for Sphinx documentation
+#
+
+# You can set these variables from the command line.
+SPHINXOPTS    =
+SPHINXBUILD   = sphinx-build
+SPHINXPROJ    = chkcsv
+SOURCEDIR     = source
+BUILDDIR      = build
+
+# Put it first so that "make" without argument is like "make help".
+help:
+	@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+.PHONY: help Makefile
+
+# Catch-all target: route all unknown targets to Sphinx using the new
+# "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
+%: Makefile
+	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
\ No newline at end of file
diff -r 000000000000 -r d4faed2f5638 doc/source/conf.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/source/conf.py Sat Jul 28 16:17:46 2018 -0700
@@ -0,0 +1,163 @@
1+# -*- coding: utf-8 -*-
2+#
3+# chkcsv documentation build configuration file, created by
4+# sphinx-quickstart on Sat Jul 28 15:25:34 2018.
5+#
6+# This file is execfile()d with the current directory set to its
7+# containing dir.
8+#
9+# Note that not all possible configuration values are present in this
10+# autogenerated file.
11+#
12+# All configuration values have a default; values that are commented out
13+# serve to show the default.
14+
15+# If extensions (or modules to document with autodoc) are in another directory,
16+# add these directories to sys.path here. If the directory is relative to the
17+# documentation root, use os.path.abspath to make it absolute, like shown here.
18+#
19+# import os
20+# import sys
21+# sys.path.insert(0, os.path.abspath('.'))
22+
23+
24+# -- General configuration ------------------------------------------------
25+
26+# If your documentation needs a minimal Sphinx version, state it here.
27+#
28+# needs_sphinx = '1.0'
29+
30+# Add any Sphinx extension module names here, as strings. They can be
31+# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
32+# ones.
33+extensions = ['sphinx.ext.autodoc',
34+ 'sphinx.ext.ifconfig']
35+
36+# Add any paths that contain templates here, relative to this directory.
37+templates_path = ['_templates']
38+
39+# The suffix(es) of source filenames.
40+# You can specify multiple suffix as a list of string:
41+#
42+# source_suffix = ['.rst', '.md']
43+source_suffix = '.rst'
44+
45+# The master toctree document.
46+master_doc = 'index'
47+
48+# General information about the project.
49+project = u'chkcsv'
50+copyright = u'2011, Dreas Nielsen'
51+author = u'Dreas Nielsen'
52+
53+# The version info for the project you're documenting, acts as replacement for
54+# |version| and |release|, also used in various other places throughout the
55+# built documents.
56+#
57+# The short X.Y version.
58+version = u'0.8'
59+# The full version, including alpha/beta/rc tags.
60+release = u'0.8.0'
61+
62+# The language for content autogenerated by Sphinx. Refer to documentation
63+# for a list of supported languages.
64+#
65+# This is also used if you do content translation via gettext catalogs.
66+# Usually you set "language" from the command line for these cases.
67+language = None
68+
69+# List of patterns, relative to source directory, that match files and
70+# directories to ignore when looking for source files.
71+# These patterns also affect html_static_path and html_extra_path
72+exclude_patterns = []
73+
74+# The name of the Pygments (syntax highlighting) style to use.
75+pygments_style = 'sphinx'
76+
77+# If true, `todo` and `todoList` produce output, else they produce nothing.
78+todo_include_todos = False
79+
80+
81+# -- Options for HTML output ----------------------------------------------
82+
83+# The theme to use for HTML and HTML Help pages. See the documentation for
84+# a list of builtin themes.
85+#
86+html_theme_path = ["themes"]
87+html_theme = 'sunwood'
88+
89+# Theme options are theme-specific and customize the look and feel of a theme
90+# further. For a list of options available for each theme, see the
91+# documentation.
92+#
93+# html_theme_options = {}
94+
95+# Add any paths that contain custom static files (such as style sheets) here,
96+# relative to this directory. They are copied after the builtin static files,
97+# so a file named "default.css" will overwrite the builtin "default.css".
98+html_static_path = ['_static']
99+
100+# Custom sidebar templates, must be a dictionary that maps document names
101+# to template names.
102+#
103+html_sidebars = { '**': ['localtoc.html', 'relations.html', 'searchbox.html'] }
104+
105+
106+# -- Options for HTMLHelp output ------------------------------------------
107+
108+# Output file base name for HTML help builder.
109+htmlhelp_basename = 'chkcsvdoc'
110+
111+
112+# -- Options for LaTeX output ---------------------------------------------
113+
114+latex_elements = {
115+ # The paper size ('letterpaper' or 'a4paper').
116+ #
117+ # 'papersize': 'letterpaper',
118+
119+ # The font size ('10pt', '11pt' or '12pt').
120+ #
121+ # 'pointsize': '10pt',
122+
123+ # Additional stuff for the LaTeX preamble.
124+ #
125+ # 'preamble': '',
126+
127+ # Latex figure (float) alignment
128+ #
129+ # 'figure_align': 'htbp',
130+}
131+
132+# Grouping the document tree into LaTeX files. List of tuples
133+# (source start file, target name, title,
134+# author, documentclass [howto, manual, or own class]).
135+latex_documents = [
136+ (master_doc, 'chkcsv.tex', u'chkcsv Documentation',
137+ u'Dreas Nielsen', 'manual'),
138+]
139+
140+
141+# -- Options for manual page output ---------------------------------------
142+
143+# One entry per manual page. List of tuples
144+# (source start file, name, description, authors, manual section).
145+man_pages = [
146+ (master_doc, 'chkcsv', u'chkcsv Documentation',
147+ [author], 1)
148+]
149+
150+
151+# -- Options for Texinfo output -------------------------------------------
152+
153+# Grouping the document tree into Texinfo files. List of tuples
154+# (source start file, target name, title, author,
155+# dir menu entry, description, category)
156+texinfo_documents = [
157+ (master_doc, 'chkcsv', u'chkcsv Documentation',
158+ author, 'chkcsv', 'One line description of project.',
159+ 'Miscellaneous'),
160+]
161+
162+
163+
diff -r 000000000000 -r d4faed2f5638 doc/source/index.rst
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/source/index.rst Sat Jul 28 16:17:46 2018 -0700
@@ -0,0 +1,290 @@
1+.. chkcsv documentation master file
2+
3+Check the Format and Content of a Delimited Text File
4+======================================================
5+
6+``chkcsv.py`` is a Python module and program that checks the format
7+and content of a comma-separated-value (CSV) or similar delimited text
8+file. It can check whether required columns are present, and the type,
9+length, and pattern of each column.
10+
11+
12+Syntax and Options
13+=========================
14+
15+.. highlight:: sh
16+
17+chkcsv.py should be run at the operating-system command line--i.e., at a shell prompt
18+in Linux or in a command window in Windows. Python may or may not need to be explicitly
19+invoked, and the .py extension may or may not need to be included, depending on your
20+operating system, operating system settings, and how chkcsv.py is
21+:ref:`installed <availability>`.
22+
23+For Linux users: The chkcsv.py file contains a shebang line pointing to /usr/bin/python,
24+so there should be no need to invoke the Python interpreter. Depending on how
25+chkcsv.py was obtained and installed, it may need to be made executable with
26+the *chmod* command.
27+
28+For Windows users: If you are unfamiliar with running Python programs at the
29+command prompt, see https://docs.python.org/2/faq/windows.html.
30+
31+In the following syntax descriptions, angle brackets identify required replaceable
32+elements, and square brackets identify optional replaceable elements.
33+
34+.. code-block:: none
35+
36+ chkcsv.py [options] <CSV file name>
37+ Arguments:
38+ <CSV file name> The name of the CSV file to check.
39+ Options:
40+ --version Show program's version number and exit
41+ -h, --help Show this help message and exit
42+ -s, --showspecs Show the format specifications allowed in the
43+ configuration file, and exit.
44+ -f FORMATSPEC, --formatspec=FORMATSPEC
45+ Name of the file with the format specification.
46+ The default is the name of the CSV file with an
47+ extension of fmt.
48+ -r, --required A data value is required in data columns for
49+ which the format specification does not include
50+ an explicit specification of whether data is
51+ required for a column. The default is false
52+ (i.e., data are not required).
53+ -q, --columnsnotrequired
54+ Columns listed in the format configuration file
55+ are not required to be present unless the
56+ column_required specification is explicitly set
57+ in the configuration file. The default is true
58+ (i.e., all columns in the configuration file
59+ are required in the CSV file).
60+ -c, --columnexit Exit immediately if there are more columns in
61+ the CSV file header than are specified in the
62+ format configuration file.
63+ -l, --linelength Allow rows of the CSV file to have fewer columns
64+ than in the column headers. The default is to
65+ report an error for short data rows. If short
66+ data rows are allowed, any row without enough
67+ columns to match the format specification will
68+ still be reported as an error.
69+ -i, --case-insensitive
70+ Case-insensitive matching of column names in
71+ the format configuration file and the CSV file.
72+ The default is case-sensitive (i.e., column
73+ names must match exactly).
74+ -e ENCODING, --encoding=ENCODING
75+ Character encoding of the CSV file. It should
76+ be one of the strings listed at
77+ http://docs.python.org/library/codecs.html#standard-encodings.
78+ -o OPTSECTION, --optsection=OPTSECTION
79+ An alternate name for the chkcsv options section
80+ in the format specification configuration file.
81+ -x, --exitonerror Exit when the first error is found.
82+
83+
84+Format Specifications
85+============================
86+
87+The format of each of the columns of the CSV file is specified in a
88+separate configuration file containing a section for each column. Each
89+section begins with the column name in square brackets, followed by
90+key-value pairs identifying the specifications for that column. Each
91+key-value pair consists of a keyword and an associated value. Keywords
92+and values should be separated by either "=" or ":". Each keyword
93+should be at the beginning of a line.
94+
95+By default, the configuration file has the same name as the CSV file,
96+but with an extension of ".fmt". An alternate configuration file can be
97+specified with the "-f" command-line option.
98+
99+The keywords that can be used for column format specifications are
100+listed below. A specific type of value should be provided for each
101+keyword. Boolean values are indicated by "Yes", "No", "True", "False",
102+"On", "Off", "1", or "0". Format specification keywords and values
103+should not be quoted in the configuration file. The allowable keywords
104+are:
105+
106+column_required
107+ Indicates whether or not the column must be present in the CSV
108+ file. This is a Boolean value. The default value is True, and can
109+ be changed with the "-q" command-line option. This format option
110+ need be included in the format configuration file only when the
111+ default is to be overridden.
112+
113+data_required
114+ Indicates whether or not a value is required in this column on
115+ every row of the CSV file. This is a Boolean value. The default
116+ value is False, and can be changed with the "-r" command-line
117+ option. This format option need be included in the format
118+ configuration file only when the default is to be overridden.
119+
120+type
121+ Identifies the type of data in the data column. Valid values are
122+ "string", "integer", "float", "bool", "date", and "datetime". Data
123+ values in the CSV file will be checked for compatibility with the
124+ specified type. If the data type is not specified, data values will
125+ be treated as strings—that is, minimum and maximum lengths and the
126+ pattern will be checked if they have been specified.
127+
128+minlen
129+ The required minimum length of data values for this column. This is
130+ only checked for string data types and for data with no type
131+ specified.
132+
133+maxlen
134+ The maximum allowed length of data values for this column. This is
135+ only checked for string data types and for data with no type
136+ specified.
137+
138+pattern
139+ A regular expression specifying the content of the column value.
140+ Patterns must match at the beginning of the column value. This is
141+ checked for string, date, and datetime data types, and for data
142+ with no type specified.
143+
144+
145+Usage Notes
146+===========================
147+
148+ * The first line of the CSV file must contain the names of the columns.
149+
150+ * The order of column specifications in the configuration file does
151+ not have to match the order of columns in the CSV file.
152+
153+ * Format specification keywords for a column may be in any order
154+ within the column section in the configuration file.
155+
156+ * Column names in the CSV file and in the configuration file are
157+ case-sensitive, and must match exactly by default. If column names
158+ in the configuration file and the CSV file don't match because the
159+ case is different, an error will be reported only if the unmatched
160+ column is required. The "-i" command-line option can be used to
161+ allow case-insensitive matching of column names.
162+
163+ * The pattern that a column should match is specified by a regular
164+ expression. The regular expression syntax supported by chkcsv.py is
165+ as documented at http://docs.python.org/library/re.html.
166+
167+ * Patterns (regular expressions) must match at the beginning of the
168+ column value. To ensure that the regular expression matches the
169+ entire column value, you may need to include "$" at the end of the
170+ regular expression.
171+
172+ * By default, all columns listed in the configuration file are
173+ considered to be required, and if the column name is not present in
174+ the CSV file (header row), this will be considered to be an error
175+ and chkcsv.py will halt immediately. The default behavior can be
176+ changed with the "-q" command-line option. If "-q" is used, or the
177+ "column_required" format specification is set to False, and the
178+ column is not present, no error will occur. If the column is
179+ present, any other format specifications will be applied. That is,
180+ even if a column is not required, if it is present and its data
181+ fails some other test, an error will be reported.
182+
183+ * chkcsv.py recognizes a wide variety of date and datetime formats.
184+ It may actually recognize a date or datetime format that the target
185+ software (e.g., a DBMS) does not. In this case, specifying a
186+ pattern for the date column can usefully restrict the types of date
187+ and datetime values that are accepted.
188+
189+ * The CSV file is expected to have the same number of data items on
190+ each row as there are column names in the first row of the file. If
191+ the "-l" command-line option is used, the CSV file may have varying
192+ numbers of data values in each row, as long as each row has enough
193+ values to correspond to each data column that will be checked. That
194+ is, if "-l" is used, and there are columns to the right of all of
195+ the required columns, those data items may or may not be present in
196+ any row without causing an error. However, if a row is short
197+ because a value in a required column is missing, and this omission
198+ does not cause any violation of any format specification, this
199+ error will not necessarily be recognized.
200+
201+ * chkcsv.py does not transform the input file in any way. It does
202+ not produce any output file or send any output to stdout except for
203+ help and version messages. chkcsv.py only writes error messages, if
204+ any, to stderr and sets the exit status value when it terminates.
205+
206+ * chkcsv.py is intended to verify that a data file is suitable for
207+ import to database, statistical, graphics, modeling, or other
208+ software. The checks that it can perform are generally sufficient
209+ to determine whether each data column is compatible with typical
210+ specifications for a database column. However, chkcsv.py does not
211+ do any row-level checks to verify that column values within a row
212+ are consistent with each other. Nor does it do any dataset-level
213+ checks to ensure, for example, that each row is unique.
214+
215+ * chkcsv.py includes a provision to allow additional options to be
216+ specified in a special section of the configuration file. By
217+ default, the name of this special section is "chkcsvoptions". A
218+ different name for this special section can be specified with the
219+ "-o" command-line argument. Currently there are no special options
220+ supported, and if this special section is present in the
221+ configuration file, it will be ignored.
222+
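The usage notes above say that a pattern must match at the beginning of the column value and that "$" may be needed to cover the whole value. The following is a minimal sketch of that behavior, assuming the check is equivalent to Python's ``re.match``; the helper name ``pattern_ok`` is hypothetical and is not part of chkcsv.py, and the sketch is written for Python 3 rather than the Python 2 used by the program itself.

```python
import re


def pattern_ok(pattern, value):
    """Return True if `pattern` matches starting at the first character of `value`.

    Hypothetical helper: it assumes chkcsv.py applies patterns with
    re.match-like semantics, as the documentation suggests.
    """
    return re.match(pattern, value) is not None


# Without "$", any text after a valid prefix is accepted.
print(pattern_ok(r"(FT|M|CM)", "FTX"))         # True
# With "$", the match has to extend to the end of the value.
print(pattern_ok(r"(FT|M|CM)$", "FTX"))        # False
# "(?i)" at the start of the pattern makes the match case-insensitive.
print(pattern_ok(r"(?i)(FT|M|CM)$", "cm"))     # True
# The match is anchored at the beginning, so a later occurrence does not count.
print(pattern_ok(r"(SO|SD|WA).*", "XSO-01"))   # False
```

If the pattern is meant to describe the entire value, ending it with "$" as shown is the simplest way to rule out trailing characters.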
223+
224+Example
225+================================
226+
227+An example configuration file might look like this:
228+
229+.. code-block:: none
230+
231+ [Study]
232+ data_required=True
233+ type=string
234+ minlen=5
235+ maxlen=20
236+ [Station]
237+ data_required=True
238+ type=string
239+ minlen=4
240+ maxlen=12
241+ [SampleDate]
242+ type=date
243+ [Sample]
244+ type=string
245+ data_required=True
246+ minlen=4
247+ maxlen=20
248+ pattern=(SO|SD|WA).*
249+ [Description]
250+ type=string
251+ column_required=False
252+ maxlen=120
253+ [UpperDepth]
254+ type=float
255+ data_required=True
256+ [LowerDepth]
257+ type=float
258+ [DepthUnits]
259+ type=string
260+ data_required=True
261+ pattern=(?i)(FT|M|CM)$
262+
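A configuration file like the one above can be explored with Python's standard ``configparser`` module, whose ``getboolean`` method accepts the same Boolean spellings listed earlier ("Yes", "No", "True", "False", "On", "Off", "1", "0"). The sketch below is an illustration only, written for Python 3: the function ``load_specs``, its parameters, and the returned dictionary layout are assumptions made for this example and do not reproduce the internals of chkcsv.py.

```python
import configparser

# A shortened copy of the example format specification.
FORMAT_SPEC = """\
[Study]
data_required=True
type=string
minlen=5
maxlen=20
[Description]
type=string
column_required=False
maxlen=120
[DepthUnits]
type=string
data_required=True
pattern=(?i)(FT|M|CM)$
"""


def load_specs(text, column_required=True, data_required=False,
               opt_section="chkcsvoptions"):
    """Parse a format specification into {column name: spec dict}.

    Hypothetical helper. The keyword defaults mirror the documented
    command-line defaults (columns required, data not required).
    """
    # Interpolation is disabled so "%" in a pattern is taken literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    specs = {}
    for col in parser.sections():
        if col == opt_section:
            continue  # the special options section is not a column
        sect = parser[col]
        specs[col] = {
            "column_required": sect.getboolean("column_required",
                                               fallback=column_required),
            "data_required": sect.getboolean("data_required",
                                             fallback=data_required),
            "type": sect.get("type", fallback="string"),
            "minlen": sect.getint("minlen", fallback=None),
            "maxlen": sect.getint("maxlen", fallback=None),
            "pattern": sect.get("pattern", fallback=None),
        }
    return specs


specs = load_specs(FORMAT_SPEC)
for name, spec in specs.items():
    print(name, spec)
```

Section names keep their case when parsed this way, which is consistent with the case-sensitive column matching described in the usage notes.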
263+
264+.. _availability:
265+
266+Availability
267+================================
268+
269+
270+The chkcsv program is available on `PyPI <https://pypi.org/project/chkcsv/>`_.
271+It can be installed with:
272+
273+.. code-block:: none
274+
275+ pip install chkcsv
276+
277+
278+Copyright and License
279+================================
280+
281+Copyright (c) 2011, R.Dreas Nielsen
282+
283+This program is free software: you can redistribute it and/or modify it
284+under the terms of the GNU General Public License as published by the
285+Free Software Foundation, either version 3 of the License, or (at your
286+option) any later version. This program is distributed in the hope that
287+it will be useful, but WITHOUT ANY WARRANTY; without even the implied
288+warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See
289+the GNU General Public License for more details. The GNU General Public
290+License is available at http://www.gnu.org/licenses/.
diff -r 000000000000 -r d4faed2f5638 doc/source/themes/sunwood/layout.html
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/source/themes/sunwood/layout.html Sat Jul 28 16:17:46 2018 -0700
@@ -0,0 +1,48 @@
1+{#
2+ sunwood/layout.html
3+ ~~~~~~~~~~~~~~~~~
4+
5+ Sphinx layout template for the sunwood theme.
6+ Modified from the agogo theme, written by Andi Albrecht.
7+
8+ :copyright: Copyright 2017, R. Dreas Nielsen
9+ :license: BSD
10+ agogo theme Copyright 2007-2016 by the Sphinx team, see AUTHORS.
11+#}
12+{%- extends "basic/layout.html" %}
13+
14+{% block header %}
15+ <div class="header-wrapper" role="banner">
16+ <div class="header">
17+ </div>
18+ {%- block headertitle %}
19+ <div class="headertitle"><a
20+ href="{{ pathto(master_doc) }}">{{ shorttitle|e }}</a>
21+ </div>
22+ {%- endblock %}
23+ </div>
24+{% endblock %}
25+
26+{% block content %}
27+ <div class="content-wrapper">
28+ <div class="content">
29+ <div class="document">
30+ {%- block document %}
31+ {{ super() }}
32+ {%- endblock %}
33+ </div>
34+ <div class="clearer"></div>
35+ </div>
36+ </div>
37+{% endblock %}
38+
39+{% block footer %}
40+ <div class="footer-wrapper">
41+ <div class="footer">
42+ <div class="right">{{ super() }}</div>
43+ </div>
44+ </div>
45+{% endblock %}
46+
47+{% block relbar1 %}{% endblock %}
48+{% block relbar2 %}{% endblock %}
diff -r 000000000000 -r d4faed2f5638 doc/source/themes/sunwood/static/sunwood.css_t
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/source/themes/sunwood/static/sunwood.css_t Sat Jul 28 16:17:46 2018 -0700
@@ -0,0 +1,654 @@
1+/*
2+ * sunwood.css_t
3+ * ~~~~~~~~~~~
4+ *
5+ * Sphinx stylesheet -- sunwood theme.
6+ *
7+ * :copyright: Copyright 2017, R. Dreas Nielsen
8+ * :license: BSD
9+ *
10+ * Adapted from the agogo theme.
11+ * agogo theme copyright: Copyright 2007-2016 by the Sphinx team, see AUTHORS.
12+ * agogo license: BSD, see LICENSE for details.
13+ *
14+ */
15+
16+* {
17+ margin: 0px;
18+ padding: 0px;
19+}
20+
21+body {
22+ font-family: {{ theme_bodyfont }};
23+ line-height: 1.4em;
24+ color: black;
25+ background-color: #f2ecd1;
26+}
27+
28+
29+/* Page layout */
30+
31+div.header, div.content, div.footer {
32+ width: {{ theme_pagewidth }};
33+ margin-left: 1em;
34+ margin-right: 0;
35+ background-color: #f2ecd1;
36+}
37+
38+div.header-wrapper {
39+ background-color: #f2ecd1;
40+ height: 7em;
41+}
42+
43+
44+/* Default body styles */
45+a {
46+ color: {{ theme_linkcolor }};
47+ text-decoration: none;
48+}
49+a:link, a:visited, a:active {
50+ color: {{ theme_linkcolor }};
51+ text-decoration: none;
52+ }
53+a:hover {
54+ color: #4E2A16;
55+ }
56+
57+div.bodywrapper a, div.footer a {
58+ text-decoration: none;
59+}
60+
61+div.bodywrapper {
62+ background-color: #F3F1E2;
63+ padding: 0.5em 0.5em;
64+ padding-left: 3em;
65+ border: 1px solid #814324;
66+ }
67+div.bodywrapper h1, div.bodywrapper h2 {
68+ margin-top: 0;
69+ margin-left: -1.0em;
70+ }
71+
72+
73+.clearer {
74+ clear: both;
75+}
76+
77+.left {
78+ float: left;
79+}
80+
81+.right {
82+ float: right;
83+}
84+
85+.line-block {
86+ display: block;
87+ margin-top: 1em;
88+ margin-bottom: 1em;
89+}
90+
91+.line-block .line-block {
92+ margin-top: 0;
93+ margin-bottom: 0;
94+ margin-left: 1.5em;
95+}
96+
97+h1, h2, h3, h4 {
98+ font-family: {{ theme_headerfont }};
99+ font-weight: normal;
100+ color: {{ theme_headercolor2 }};
101+ margin-bottom: .8em;
102+}
103+
104+h1 {
105+ color: {{ theme_headercolor1 }};
106+ border-bottom: 2px solid {{ theme_headercolor1 }};
107+ font-size: 1.5em;
108+}
109+
110+h2 {
111+ padding-bottom: .5em;
112+ border-bottom: 1px solid {{ theme_headercolor2 }};
113+ font-size: 1.2em;
114+}
115+
116+a.headerlink {
117+ visibility: hidden;
118+ color: #dddddd;
119+ padding-left: .3em;
120+}
121+
122+h1:hover > a.headerlink,
123+h2:hover > a.headerlink,
124+h3:hover > a.headerlink,
125+h4:hover > a.headerlink,
126+h5:hover > a.headerlink,
127+h6:hover > a.headerlink,
128+dt:hover > a.headerlink,
129+caption:hover > a.headerlink,
130+p.caption:hover > a.headerlink,
131+div.code-block-caption:hover > a.headerlink {
132+ visibility: visible;
133+}
134+
135+img {
136+ margin-left: 2em;
137+ border: 1px dotted {{ theme_border_color }};
138+}
139+
140+div.admonition {
141+ margin-top: 10px;
142+ margin-bottom: 10px;
143+ padding: 2px 7px 1px 7px;
144+ border-left: 0.2em solid {{ theme_border_color }};
145+}
146+
147+p.admonition-title {
148+ margin: 0px 10px 5px 0px;
149+ font-weight: bold;
150+}
151+
152+dt:target, .highlighted {
153+ background-color: #fbe54e;
154+}
155+
156+table {
157+ font-family: "Liberation Sans", sans-serif;
158+ border-top: 2px solid #814324;
159+ border-bottom: 2px solid #814324;
160+ border-left: 1px dotted #814324;
161+ border-right: 1px dotted #814324;
162+ border-collapse: collapse;
163+ font-size: 0.9em;
164+ color: #814324;
165+ vertical-align: top;
166+ line-height: 120%;
167+ }
168+td {
169+ color: black;
170+ text-align: left;
171+ padding-left: 10px;
172+ padding-right: 10px;
173+ padding-top: 4px;
174+ padding-bottom: 4px;
175+ border-right: 1px dotted #814324;
176+ }
177+th {
178+ padding: 6px 6px;
179+ text-align: center;
180+ color: #814324;
181+ background-color: #e4d798;
182+ border-right: 1px dotted #814324;
183+ }
184+tr.hdr {
185+ font-weight: bold;
186+ }
187+thead tr {
188+ border-bottom: 1px solid #814324;
189+ background-color: #F3F1E2;
190+ }
191+tbody tr {
192+ border-bottom: 1px dotted #814324;
193+ }
194+
195+/* Header */
196+
197+div.header {
198+ position: absolute;
199+ top: 0; left: 10%; width: 65%;
200+ background-color: #FFE4B5;
201+ background-color: rgb(228, 215, 152);
202+ border-bottom: 4px solid #814324;
203+ border-left: 4px solid #814324;
204+ border-right: 4px solid #814324;
205+ height: 5em;
206+ }
207+
208+div.headertitle {
209+ position: relative;
210+ top: 0.5em; left: 5%; width: 75%;
211+ padding: 0.5em 3em 0.5em 0em;
212+ text-align: center;
213+ font-family: "Liberation Sans", "DejaVu Sans", "Bitstream Vera Sans", Arial, Helvetica, sans-serif;
214+ font-size: 1.5em;
215+ font-weight: bold;
216+ word-spacing: 0.05em;
217+ color: #814324;
218+ background-color: rgb(250, 250, 210);
219+ border: 1px solid #814324;
220+ opacity: 0.8;
221+ }
222+
223+
224+div.header .headertitle a {
225+ color: #814324;
226+}
227+
228+div.header div.rel {
229+ margin-top: 1em;
230+}
231+
232+div.header div.rel a {
233+ color: {{ theme_headerlinkcolor }};
234+ letter-spacing: .1em;
235+ text-transform: uppercase;
236+}
237+
238+p.logo {
239+ float: right;
240+}
241+
242+img.logo {
243+ border: 0;
244+}
245+
246+
247+/* Content */
248+div.content-wrapper {
249+ background-color: #f2ecd1;
250+ padding-bottom: 20px;
251+}
252+
253+div.document {
254+ width: {{ theme_documentwidth }};
255+ float: left;
256+ background-color: #f2ecd1;
257+}
258+
259+div.body {
260+ padding-right: 2em;
261+ text-align: {{ theme_textalign }};
262+}
263+
264+div.document h1 {
265+ line-height: 120%;
266+}
267+
268+div.document ul {
269+ margin: 1.5em;
270+ list-style-type: disc;
271+}
272+
273+div.document li::marker {
274+ color: #814324;
275+ }
276+
277+div.document dd {
278+ margin-left: 1.2em;
279+ margin-top: .4em;
280+ margin-bottom: 1em;
281+}
282+
283+div.document .section {
284+ margin-top: 1.7em;
285+}
286+div.document .section:first-child {
287+ margin-top: 0px;
288+}
289+
290+div.document div.highlight {
291+ margin: 0.5em 1em;
292+ padding: 0.5em 1.0em;
293+ border: 1px dotted rgb(123, 108, 34);
294+ min-width: 40em;
295+ max-width: 80%;
296+ overflow-x: auto;
297+ font-family: "Liberation Mono", "DejaVu Sans Mono", "Bitstream Vera Sans Mono", "Lucida Console", "Courier New", Courier, fixed;
298+ font-size: 1.0em;
299+ line-height: 110%;
300+ background-color: #F5F4F0;
301+ color: #4C1200;
302+ white-space: pre;
303+ display: inline-block;
304+ }
305+
306+div.document div.literal-block-wrapper {
307+ margin-top: .8em;
308+ margin-bottom: .8em;
309+}
310+
311+div.document div.literal-block-wrapper div.highlight {
312+ margin: 0;
313+}
314+
315+div.document div.code-block-caption span.caption-number {
316+ padding: 0.1em 0.3em;
317+ font-style: italic;
318+}
319+
320+div.document div.code-block-caption span.caption-text {
321+}
322+
323+div.document h2 {
324+ margin-top: .7em;
325+}
326+
327+div.document p {
328+ margin-bottom: .5em;
329+}
330+
331+div.document li.toctree-l1 {
332+ margin-bottom: 1em;
333+}
334+
335+div.document .descname {
336+ font-weight: bold;
337+}
338+
339+div.document .sig-paren {
340+ font-size: larger;
341+}
342+
343+div.document .docutils.literal {
344+ background-color: #eeeeec;
345+ padding: 1px;
346+}
347+
348+div.document .docutils.xref.literal {
349+ background-color: transparent;
350+ padding: 0px;
351+}
352+
353+div.document blockquote {
354+ margin: 1em;
355+}
356+
357+div.document ol {
358+ margin: 1.5em;
359+}
360+
361+
362+/* Sidebar */
363+
364+div.sidebar {
365+ width: {{ theme_sidebarwidth }};
366+ float: right;
367+ font-family: "Liberation Sans", sans-serif;
368+ font-size: .9em;
369+ line-height: 120%;
370+ border: 1px solid #814324;
371+ color: #814324;
372+ /*background-color: #f2ecd1;*/
373+ /*background-color: #f3f1e2;*/
374+ background-color: #f3efda;
375+ box-shadow: 5px 5px 3px #AAAAAA;
376+}
377+
378+div.sidebar a, div.header a {
379+ text-decoration: none;
380+}
381+
382+div.sidebar a:hover, div.header a:hover {
383+ text-decoration: underline;
384+ color: #4e2a16;
385+}
386+
387+
388+div.sidebar h3 {
389+ font-family: "Liberation Sans", sans-serif;
390+ color: #814324;
391+ font-weight: bold;
392+ background-color: #e4d798;
393+ border-bottom: 1px solid #814324;
394+ border-top: 1px solid #814324;
395+ margin-bottom: 5px;
396+ padding-left: 5px;
397+ padding-top: 5px;
398+ padding-bottom: 3px;
399+ letter-spacing: .1em;
400+}
401+
402+div.sidebar ul {
403+ list-style-type: none;
404+ margin-left: 3px;
405+}
406+
407+div.sidebar li.toctree-l1 a {
408+ display: block;
409+ color: #814324;
410+ background-color: transparent;
411+ margin-left: 3px;
412+ padding-left: 3px;
413+}
414+
415+div.sidebar li.toctree-l2 a {
416+ color: #814324;
417+ background-color: transparent;
418+ border: none;
419+ margin-left: 1em;
420+}
421+
422+div.sidebar li.toctree-l3 a {
423+ color: #814324;
424+ background-color: transparent;
425+ border: none;
426+ margin-left: 2em;
427+}
428+
429+div.sidebar li.toctree-l2:last-child a {
430+ border-bottom: none;
431+}
432+
433+div.sidebar input[type="text"] {
434+ font-size: 0.9em;
435+ width: 150px;
436+ margin-left: 6px;
437+ margin-bottom: 6px;
438+}
439+
440+div.sidebar input[type="submit"] {
441+ font-size: 0.9em;
442+ width: auto;
443+ padding-left: 3px;
444+ padding-right: 3px;
445+ text-align: center;
446+ margin-bottom: 6px;
447+}
448+
449+div.sidebar div.sidebarnavlinks {
450+ padding: 0.5em 1.0em;
451+ font-size: 0.9em;
452+ }
453+
454+
455+/* Footer */
456+
457+div.footer-wrapper {
458+ background: #e4d798;
459+ border-top: 4px solid {{ theme_border_color }};
460+ border-bottom: 2px solid {{ theme_border_color }};
461+ padding-top: 10px;
462+ padding-bottom: 10px;
463+ min-height: 2em;
464+}
465+
466+div.footer, div.footer a {
467+ background: #e4d798;
468+ color: #888a85;
469+}
470+
471+div.footer .right {
472+ text-align: right;
473+}
474+
475+div.footer .left {
476+ text-transform: uppercase;
477+}
478+
479+
480+/* Styles copied from basic theme */
481+
482+img.align-left, .figure.align-left, object.align-left {
483+ clear: left;
484+ float: left;
485+ margin-right: 1em;
486+}
487+
488+img.align-right, .figure.align-right, object.align-right {
489+ clear: right;
490+ float: right;
491+ margin-left: 1em;
492+}
493+
494+img.align-center, .figure.align-center, object.align-center {
495+ display: block;
496+ margin-left: auto;
497+ margin-right: auto;
498+}
499+
500+.align-left {
501+ text-align: left;
502+}
503+
504+.align-center {
505+ text-align: center;
506+}
507+
508+.align-right {
509+ text-align: right;
510+}
511+
512+table caption span.caption-number {
513+ font-style: italic;
514+}
515+
516+table caption span.caption-text {
517+}
518+
519+div.figure p.caption span.caption-number {
520+ font-style: italic;
521+}
522+
523+div.figure p.caption span.caption-text {
524+}
525+
526+/* -- search page ----------------------------------------------------------- */
527+
528+ul.search {
529+ margin: 10px 0 0 20px;
530+ padding: 0;
531+}
532+
533+ul.search li {
534+ padding: 5px 0 5px 20px;
535+ background-image: url(file.png);
536+ background-repeat: no-repeat;
537+ background-position: 0 7px;
538+}
539+
540+ul.search li a {
541+ font-weight: bold;
542+}
543+
544+ul.search li div.context {
545+ color: #888;
546+ margin: 2px 0 0 30px;
547+ text-align: left;
548+}
549+
550+ul.keywordmatches li.goodmatch a {
551+ font-weight: bold;
552+}
553+
554+/* -- index page ------------------------------------------------------------ */
555+
556+table.contentstable {
557+ width: 90%;
558+}
559+
560+table.contentstable p.biglink {
561+ line-height: 150%;
562+}
563+
564+a.biglink {
565+ font-size: 1.3em;
566+}
567+
568+span.linkdescr {
569+ font-style: italic;
570+ padding-top: 5px;
571+ font-size: 90%;
572+}
573+
574+/* -- general index --------------------------------------------------------- */
575+
576+table.indextable td {
577+ text-align: left;
578+ vertical-align: top;
579+}
580+
581+table.indextable ul {
582+ margin-top: 0;
583+ margin-bottom: 0;
584+ list-style-type: none;
585+}
586+
587+table.indextable > tbody > tr > td > ul {
588+ padding-left: 0em;
589+}
590+
591+table.indextable tr.pcap {
592+ height: 10px;
593+}
594+
595+table.indextable tr.cap {
596+ margin-top: 10px;
597+ background-color: #f2f2f2;
598+}
599+
600+img.toggler {
601+ margin-right: 3px;
602+ margin-top: 3px;
603+ cursor: pointer;
604+}
605+
606+/* -- domain module index --------------------------------------------------- */
607+
608+table.modindextable td {
609+ padding: 2px;
610+ border-collapse: collapse;
611+}
612+
613+/* -- viewcode extension ---------------------------------------------------- */
614+
615+.viewcode-link {
616+ float: right;
617+}
618+
619+.viewcode-back {
620+ float: right;
621+ font-family: {{ theme_bodyfont }};
622+}
623+
624+div.viewcode-block:target {
625+ margin: -1px -3px;
626+ padding: 0.5em 1.0em;
627+ background-color: #F5F4F0;
628+ border: 1px dotted {{ theme_border_color }};
629+}
630+
631+div.code-block-caption {
632+ background-color: #ddd;
633+ color: #333;
634+ padding: 2px 5px;
635+ font-size: small;
636+}
637+
638+/* -- math display ---------------------------------------------------------- */
639+
640+div.body div.math p {
641+ text-align: center;
642+}
643+
644+span.eqno {
645+ float: right;
646+}
647+
648+
649+/* -- other customization --------------------------------------------------- */
650+
651+dt {
652+ font-family: "Liberation Sans", "DejaVu Sans", "Bitstream Vera Sans", Arial, Helvetica, sans-serif;
653+ }
654+
diff -r 000000000000 -r d4faed2f5638 doc/source/themes/sunwood/theme.conf
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/doc/source/themes/sunwood/theme.conf Sat Jul 28 16:17:46 2018 -0700
@@ -0,0 +1,19 @@
1+[theme]
2+inherit = basic
3+stylesheet = sunwood.css
4+pygments_style = tango
5+
6+[options]
7+bodyfont = "Liberation Sans Regular", "Georgia", "Times New Roman", serif
8+headerfont = "Liberation Sans", "Verdana", Arial, sans-serif
9+pagewidth = 70em
10+documentwidth = 50em
11+sidebarwidth = 20em
12+bgcolor = #f3f1e2
13+headerbg = #f3f1e2
14+border_color = #814324
15+linkcolor = #4C1200
16+headercolor1 = #814324
17+headercolor2 = #814324
18+headerlinkcolor = #4C1200
19+textalign = justify
diff -r 000000000000 -r d4faed2f5638 setup.py
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/setup.py Sat Jul 28 16:17:46 2018 -0700
@@ -0,0 +1,24 @@
1+from distutils.core import setup
2+
3+setup(name='chkcsv',
4+ version='0.8.0.1',
5+ description="Check the format of a CSV file",
6+ author='Dreas Nielsen',
7+ author_email='dreas.nielsen@gmail.com',
8+ url='none',
9+ scripts=['chkcsv/chkcsv.py'],
10+ classifiers=[
11+ 'Development Status :: 5 - Production/Stable',
12+ 'Environment :: Console',
13+ 'Intended Audience :: End Users/Desktop',
14+ 'License :: OSI Approved :: GNU General Public License (GPL)',
15+ 'Natural Language :: English',
16+ 'Operating System :: OS Independent',
17+ 'Topic :: Text Processing :: General',
18+ 'Topic :: Office/Business'
19+ ],
20+ long_description="""``chkcsv.py`` is a Python module and program
21+that checks the format of data in a CSV file. It can check whether required
22+columns and data are present, and the type of data in each column. Pattern
23+matching using regular expressions is supported."""
24+ )