OmniSciDB  a5dc49c757
FixedLengthArrayNoneEncoder Class Reference

#include <FixedLengthArrayNoneEncoder.h>


Public Member Functions

 FixedLengthArrayNoneEncoder (AbstractBuffer *buffer, size_t as)
 
size_t getNumElemsForBytesEncodedDataAtIndices (const int8_t *index_data, const std::vector< size_t > &selected_idx, const size_t byte_limit) override
 
size_t getNumElemsForBytesInsertData (const std::vector< ArrayDatum > *srcData, const int start_idx, const size_t numAppendElems, const size_t byteLimit, const bool replicating=false)
 
std::shared_ptr< ChunkMetadata > appendEncodedDataAtIndices (const int8_t *index_data, int8_t *data, const std::vector< size_t > &selected_idx) override
 
std::shared_ptr< ChunkMetadata > appendEncodedData (const int8_t *index_data, int8_t *data, const size_t start_idx, const size_t num_elements) override
 
std::shared_ptr< ChunkMetadata > appendData (int8_t *&src_data, const size_t num_elems_to_append, const SQLTypeInfo &ti, const bool replicating=false, const int64_t offset=-1) override
 
std::shared_ptr< ChunkMetadata > appendData (const std::vector< ArrayDatum > *srcData, const int start_idx, const size_t numAppendElems, const bool replicating=false)
 
void getMetadata (const std::shared_ptr< ChunkMetadata > &chunkMetadata) override
 
std::shared_ptr< ChunkMetadata > getMetadata (const SQLTypeInfo &ti) override
 
void updateStats (const int64_t, const bool) override
 
void updateStats (const double, const bool) override
 
void reduceStats (const Encoder &) override
 
void updateStats (const int8_t *const src_data, const size_t num_elements) override
 
void updateStats (const std::vector< std::string > *const src_data, const size_t start_idx, const size_t num_elements) override
 
void updateStats (const std::vector< ArrayDatum > *const src_data, const size_t start_idx, const size_t num_elements) override
 
void writeMetadata (FILE *f) override
 
void readMetadata (FILE *f) override
 
void copyMetadata (const Encoder *copyFromEncoder) override
 
void updateMetadata (int8_t *array)
 
bool resetChunkStats (const ChunkStats &stats) override
 Reset chunk-level stats (min, max, nulls) using new values from the argument.
 
void resetChunkStats () override
 
- Public Member Functions inherited from Encoder
 Encoder (Data_Namespace::AbstractBuffer *buffer)
 
virtual ~Encoder ()
 
virtual void updateStatsEncoded (const int8_t *const dst_data, const size_t num_elements)
 
size_t getNumElems () const
 
void setNumElems (const size_t num_elems)
 

Static Public Member Functions

static bool is_null_ignore_not_null (const SQLTypeInfo &type, int8_t *array)
 
static bool is_null (const SQLTypeInfo &type, int8_t *array)
 
- Static Public Member Functions inherited from Encoder
static Encoder * Create (Data_Namespace::AbstractBuffer *buffer, const SQLTypeInfo sqlType)
 

Public Attributes

Datum elem_min
 
Datum elem_max
 
bool has_nulls
 
bool initialized
 

Private Member Functions

bool is_null (int8_t *array)
 
bool is_null_ignore_not_null (int8_t *array)
 
void update_elem_stats (const ArrayDatum &array)
 

Private Attributes

std::mutex EncoderMutex_
 
std::mutex print_mutex_
 
size_t array_size
 

Additional Inherited Members

- Protected Attributes inherited from Encoder
size_t num_elems_
 
Data_Namespace::AbstractBuffer * buffer_
 
DecimalOverflowValidator decimal_overflow_validator_
 
DateDaysOverflowValidator date_days_overflow_validator_
 

Detailed Description

Definition at line 40 of file FixedLengthArrayNoneEncoder.h.

Constructor & Destructor Documentation

FixedLengthArrayNoneEncoder::FixedLengthArrayNoneEncoder ( AbstractBuffer *  buffer,
size_t  as 
)
inline

Definition at line 42 of file FixedLengthArrayNoneEncoder.h.

43  : Encoder(buffer), has_nulls(false), initialized(false), array_size(as) {}

Member Function Documentation

std::shared_ptr<ChunkMetadata> FixedLengthArrayNoneEncoder::appendData ( int8_t *&  src_data,
const size_t  num_elems_to_append,
const SQLTypeInfo &  ti,
const bool  replicating = false,
const int64_t  offset = -1 
)
inline override virtual

Append data to the chunk buffer backing this encoder.

Parameters
src_data- Source data for the append
num_elems_to_append- Number of elements to append
ti- SQL type info for the column. TODO(adb): used?
replicating- Pass one value and fill the chunk with it
offset- Write data starting at a given offset. The default of -1 indicates an append; an offset of 0 rewrites the chunk up to num_elems_to_append.

Implements Encoder.

Definition at line 97 of file FixedLengthArrayNoneEncoder.h.

References UNREACHABLE.

Referenced by Chunk_NS::Chunk::appendData(), appendEncodedData(), and appendEncodedDataAtIndices().

101  {
102  UNREACHABLE(); // should never be called for arrays
103  return nullptr;
104  }

std::shared_ptr<ChunkMetadata> FixedLengthArrayNoneEncoder::appendData ( const std::vector< ArrayDatum > *  srcData,
const int  start_idx,
const size_t  numAppendElems,
const bool  replicating = false 
)
inline

Definition at line 106 of file FixedLengthArrayNoneEncoder.h.

References Data_Namespace::AbstractBuffer::append(), array_size, Encoder::buffer_, CHECK_EQ, getMetadata(), Data_Namespace::AbstractBuffer::isDirty(), Encoder::num_elems_, Data_Namespace::AbstractBuffer::reserve(), Data_Namespace::AbstractBuffer::setDirty(), and updateStats().

109  {
110  const size_t existing_data_size = num_elems_ * array_size;
111  const size_t append_data_size = array_size * numAppendElems;
112  buffer_->reserve(existing_data_size + append_data_size);
113  std::vector<int8_t> append_buffer(append_data_size);
114  int8_t* append_ptr = append_buffer.data();
115 
116  // There was some worry about the change implemented to write the append data to an
117  // intermediate buffer, but testing on import and ctas of 20M points, we never append
118  // more than 1.6MB and 1MB of data at a time, respectively, so at least for fixed
119  // length types this should not be an issue (varlen types, which can be massive even
120  // for a single field/row, are a different story however)
121 
122  if (replicating) {
123  const size_t len = (*srcData)[0].length;
124  CHECK_EQ(len, array_size);
125  const int8_t* replicated_ptr = (*srcData)[0].pointer;
126  for (size_t i = 0; i < numAppendElems; ++i) {
127  std::memcpy(append_ptr + i * array_size, replicated_ptr, array_size);
128  }
129  } else {
130  for (size_t i = 0; i < numAppendElems; ++i) {
131  // Length of the appended array should be equal to the fixed length,
132  // all others should have been discarded, assert if something slips through
133  const size_t source_idx = start_idx + i;
134  const size_t len = (*srcData)[source_idx].length;
135  CHECK_EQ(len, array_size);
136  // NULL arrays have been filled with subtype's NULL sentinels,
137  // should be appended as regular data, same size
138  std::memcpy(
139  append_ptr + i * array_size, (*srcData)[source_idx].pointer, array_size);
140  }
141  }
142 
143  buffer_->append(append_ptr, append_data_size);
144 
145  if (replicating) {
146  updateStats(srcData, 0, 1);
147  } else {
148  updateStats(srcData, start_idx, numAppendElems);
149  }
150 
151  // make sure buffer_ is flushed even if no new data is appended to it
152  // (e.g. empty strings) because the metadata needs to be flushed.
153  if (!buffer_->isDirty()) {
154  buffer_->setDirty();
155  }
156 
157  num_elems_ += numAppendElems;
158  auto chunk_metadata = std::make_shared<ChunkMetadata>();
159  getMetadata(chunk_metadata);
160  return chunk_metadata;
161  }
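
The staging step in the listing above can be sketched in isolation. This is a minimal illustration, not the OmniSciDB implementation: `FixedArrayView` and `stage_fixed_arrays` are hypothetical names standing in for `ArrayDatum` and the body of `appendData`, and the buffer append and stats update are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for ArrayDatum: a byte length plus a non-owning pointer.
struct FixedArrayView {
  std::size_t length;
  const std::int8_t* pointer;
};

// Copies num_append_elems fixed-size arrays into one contiguous staging buffer.
std::vector<std::int8_t> stage_fixed_arrays(const std::vector<FixedArrayView>& src,
                                            std::size_t start_idx,
                                            std::size_t num_append_elems,
                                            std::size_t array_size,
                                            bool replicating) {
  std::vector<std::int8_t> staged(array_size * num_append_elems);
  for (std::size_t i = 0; i < num_append_elems; ++i) {
    // When replicating, every output slot is a copy of the first source array.
    const FixedArrayView& item = replicating ? src[0] : src[start_idx + i];
    // Every source array must match the fixed length exactly.
    assert(item.length == array_size);
    std::memcpy(staged.data() + i * array_size, item.pointer, array_size);
  }
  return staged;
}
```

Staging into a single buffer allows one append call on the backing buffer instead of one per array, which appears to be the trade-off the source comment discusses.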

std::shared_ptr<ChunkMetadata> FixedLengthArrayNoneEncoder::appendEncodedData ( const int8_t *  index_data,
int8_t *  data,
const size_t  start_idx,
const size_t  num_elements 
)
inline override virtual

Append encoded data to the chunk buffer backing this encoder.

Parameters
index_data- (optional) the index data of data to append
data- the data to append
start_idx- the position to start encoding from in the data array
num_elements- the number of elements to encode from the data array
Returns
updated chunk metadata for the chunk buffer backing this encoder

NOTE: index_data must be non-null for varlen encoder types.

Implements Encoder.

Definition at line 83 of file FixedLengthArrayNoneEncoder.h.

References appendData(), and array_size.

86  {
87  std::vector<ArrayDatum> data_subset;
88  data_subset.reserve(num_elements);
89  for (size_t count = 0; count < num_elements; ++count) {
90  auto current_data = data + array_size * (start_idx + count);
91  data_subset.emplace_back(
92  ArrayDatum(array_size, current_data, false, DoNothingDeleter{}));
93  }
94  return appendData(&data_subset, 0, num_elements, false);
95  }

std::shared_ptr<ChunkMetadata> FixedLengthArrayNoneEncoder::appendEncodedDataAtIndices ( const int8_t *  index_data,
int8_t *  data,
const std::vector< size_t > &  selected_idx 
)
inline override virtual

Append selected encoded data to the chunk buffer backing this encoder.

Parameters
index_data- (optional) the index data of data to append
data- the data to append
selected_idx- which indices in the encoded data to append
Returns
updated chunk metadata for the chunk buffer backing this encoder

NOTE: index_data must be non-null for varlen encoder types.

Implements Encoder.

Definition at line 67 of file FixedLengthArrayNoneEncoder.h.

References appendData(), array_size, and is_null_ignore_not_null().

70  {
71  std::vector<ArrayDatum> data_subset;
72  data_subset.reserve(selected_idx.size());
73  for (const auto& index : selected_idx) {
74  auto current_data = data + array_size * (index);
75  data_subset.emplace_back(ArrayDatum(array_size,
76  current_data,
77  is_null_ignore_not_null(current_data),
78  DoNothingDeleter{}));
79  }
80  return appendData(&data_subset, 0, selected_idx.size(), false);
81  }
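
The index arithmetic used to pick arrays out of the encoded data can be shown on its own. The sketch below is an assumption-laden simplification: `select_fixed_arrays` is a hypothetical helper that only computes the start pointer of each selected array, without building `ArrayDatum` objects or checking for NULL arrays.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns a pointer to the start of each selected fixed-size array.
// Array number k begins at byte offset array_size * k in the data buffer.
std::vector<const std::int8_t*> select_fixed_arrays(
    const std::int8_t* data,
    std::size_t array_size,
    const std::vector<std::size_t>& selected_idx) {
  assert(data != nullptr);
  std::vector<const std::int8_t*> views;
  views.reserve(selected_idx.size());
  for (const std::size_t index : selected_idx) {
    views.push_back(data + array_size * index);
  }
  return views;
}
```

Because every array has the same byte length, no index buffer is needed to locate an element, which is consistent with `index_data` going unused in the listing above.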

void FixedLengthArrayNoneEncoder::copyMetadata ( const Encoder *  copyFromEncoder)
inline override virtual

Implements Encoder.

Definition at line 217 of file FixedLengthArrayNoneEncoder.h.

References elem_max, elem_min, Encoder::getNumElems(), has_nulls, initialized, and Encoder::num_elems_.

217  {
218  num_elems_ = copyFromEncoder->getNumElems();
219  auto array_encoder =
220  dynamic_cast<const FixedLengthArrayNoneEncoder*>(copyFromEncoder);
221  elem_min = array_encoder->elem_min;
222  elem_max = array_encoder->elem_max;
223  has_nulls = array_encoder->has_nulls;
224  initialized = array_encoder->initialized;
225  }

void FixedLengthArrayNoneEncoder::getMetadata ( const std::shared_ptr< ChunkMetadata > &  chunkMetadata)
inline override virtual

Reimplemented from Encoder.

Definition at line 163 of file FixedLengthArrayNoneEncoder.h.

References elem_max, elem_min, Encoder::getMetadata(), and has_nulls.

Referenced by appendData().

163  {
164  Encoder::getMetadata(chunkMetadata); // call on parent class
165  chunkMetadata->fillChunkStats(elem_min, elem_max, has_nulls);
166  }

std::shared_ptr<ChunkMetadata> FixedLengthArrayNoneEncoder::getMetadata ( const SQLTypeInfo &  ti)
inline override virtual

Implements Encoder.

Definition at line 169 of file FixedLengthArrayNoneEncoder.h.

References elem_max, elem_min, and has_nulls.

169  {
170  auto chunk_metadata = std::make_shared<ChunkMetadata>(
171  ti, 0, 0, ChunkStats{elem_min, elem_max, has_nulls});
172  return chunk_metadata;
173  }

size_t FixedLengthArrayNoneEncoder::getNumElemsForBytesEncodedDataAtIndices ( const int8_t *  index_data,
const std::vector< size_t > &  selected_idx,
const size_t  byte_limit 
)
inline override virtual

Compute the maximum number of variable length encoded elements given a byte limit.

Parameters
index_data- (optional) index data for the encoded type
selected_idx- which indices in the encoded data to consider
byte_limit- byte limit that must be respected
Returns
the number of elements

NOTE: optional parameters above may be ignored by the implementation, but may or may not be required depending on the encoder type backing the implementation.

Implements Encoder.

Definition at line 45 of file FixedLengthArrayNoneEncoder.h.

References array_size.

47  {
48  size_t data_size = selected_idx.size() * array_size;
49  if (data_size > byte_limit) {
50  data_size = byte_limit;
51  }
52  return data_size / array_size;
53  }

size_t FixedLengthArrayNoneEncoder::getNumElemsForBytesInsertData ( const std::vector< ArrayDatum > *  srcData,
const int  start_idx,
const size_t  numAppendElems,
const size_t  byteLimit,
const bool  replicating = false 
)
inline

Definition at line 55 of file FixedLengthArrayNoneEncoder.h.

References array_size.

Referenced by Chunk_NS::Chunk::getNumElemsForBytesInsertData().

59  {
60  size_t dataSize = numAppendElems * array_size;
61  if (dataSize > byteLimit) {
62  dataSize = byteLimit;
63  }
64  return dataSize / array_size;
65  }
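
Both `getNumElemsForBytesInsertData` and `getNumElemsForBytesEncodedDataAtIndices` use the same clamp-then-divide calculation. A minimal standalone sketch follows; the function name `num_elems_within_byte_limit` is hypothetical and not part of the class.

```cpp
#include <cassert>
#include <cstddef>

// Returns how many whole fixed-size arrays fit within byte_limit,
// capped at the number of candidate arrays.
std::size_t num_elems_within_byte_limit(std::size_t num_candidates,
                                        std::size_t array_size_bytes,
                                        std::size_t byte_limit) {
  assert(array_size_bytes > 0);  // guards the division below
  std::size_t data_size = num_candidates * array_size_bytes;
  if (data_size > byte_limit) {
    data_size = byte_limit;
  }
  // Integer division drops any partial trailing array.
  return data_size / array_size_bytes;
}
```

For example, ten 12-byte arrays under a 100-byte limit yield 8, since 100 / 12 truncates to 8.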

static bool FixedLengthArrayNoneEncoder::is_null ( const SQLTypeInfo &  type,
int8_t *  array 
)
inline static

Definition at line 281 of file FixedLengthArrayNoneEncoder.h.

References SQLTypeInfo::get_notnull(), and is_null_ignore_not_null().

Referenced by Fragmenter_Namespace::FixedLenArrayChunkConverter::convertToColumnarFormat(), and updateMetadata().

281  {
282  if (type.get_notnull()) {
283  return false;
284  }
285  return is_null_ignore_not_null(type, array);
286  }

bool FixedLengthArrayNoneEncoder::is_null ( int8_t *  array)
inline private

Definition at line 315 of file FixedLengthArrayNoneEncoder.h.

References Encoder::buffer_, Data_Namespace::AbstractBuffer::getSqlType(), and is_null().

Referenced by is_null().

315 { return is_null(buffer_->getSqlType(), array); }

static bool FixedLengthArrayNoneEncoder::is_null_ignore_not_null ( const SQLTypeInfo &  type,
int8_t *  array 
)
inline static

Definition at line 231 of file FixedLengthArrayNoneEncoder.h.

References CHECK_EQ, SQLTypeInfo::get_compression(), SQLTypeInfo::get_subtype(), kBIGINT, kBOOLEAN, kCHAR, kDATE, kDECIMAL, kDOUBLE, kENCODING_DICT, kFLOAT, kINT, kNUMERIC, kSMALLINT, kTEXT, kTIME, kTIMESTAMP, kTINYINT, kVARCHAR, NULL_ARRAY_BIGINT, NULL_ARRAY_BOOLEAN, NULL_ARRAY_DOUBLE, NULL_ARRAY_FLOAT, NULL_ARRAY_INT, NULL_ARRAY_SMALLINT, NULL_ARRAY_TINYINT, and UNREACHABLE.

Referenced by appendEncodedDataAtIndices(), is_null(), and is_null_ignore_not_null().

231  {
232  switch (type.get_subtype()) {
233  case kBOOLEAN: {
234  return (array[0] == NULL_ARRAY_BOOLEAN);
235  }
236  case kINT: {
237  const int32_t* int_array = (int32_t*)array;
238  return (int_array[0] == NULL_ARRAY_INT);
239  }
240  case kSMALLINT: {
241  const int16_t* smallint_array = (int16_t*)array;
242  return (smallint_array[0] == NULL_ARRAY_SMALLINT);
243  }
244  case kTINYINT: {
245  const int8_t* tinyint_array = (int8_t*)array;
246  return (tinyint_array[0] == NULL_ARRAY_TINYINT);
247  }
248  case kBIGINT:
249  case kNUMERIC:
250  case kDECIMAL: {
251  const int64_t* bigint_array = (int64_t*)array;
252  return (bigint_array[0] == NULL_ARRAY_BIGINT);
253  }
254  case kFLOAT: {
255  const float* flt_array = (float*)array;
256  return (flt_array[0] == NULL_ARRAY_FLOAT);
257  }
258  case kDOUBLE: {
259  const double* dbl_array = (double*)array;
260  return (dbl_array[0] == NULL_ARRAY_DOUBLE);
261  }
262  case kTIME:
263  case kTIMESTAMP:
264  case kDATE: {
265  const int64_t* tm_array = reinterpret_cast<int64_t*>(array);
266  return (tm_array[0] == NULL_ARRAY_BIGINT);
267  }
268  case kCHAR:
269  case kVARCHAR:
270  case kTEXT: {
272  const int32_t* int_array = (int32_t*)array;
273  return (int_array[0] == NULL_ARRAY_INT);
274  }
275  default:
276  UNREACHABLE();
277  }
278  return false;
279  }
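
The listing above treats a whole array as NULL when its first element equals a per-type sentinel. The sketch below shows that idea for 32-bit integers only. The constant `kExampleNullArrayInt` is a placeholder chosen for illustration; the actual `NULL_ARRAY_INT` value is defined in the OmniSciDB headers and is not shown on this page. The helper name is also hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Placeholder sentinel for illustration only; not taken from the source.
constexpr std::int32_t kExampleNullArrayInt =
    std::numeric_limits<std::int32_t>::min() + 1;

// Treats an int32 array as NULL when its first element equals the sentinel.
// memcpy is used instead of a pointer cast so the read does not depend on
// the byte buffer being suitably aligned for int32_t.
bool first_elem_is_null_sentinel(const std::int8_t* array) {
  assert(array != nullptr);
  std::int32_t first;
  std::memcpy(&first, array, sizeof(first));
  return first == kExampleNullArrayInt;
}
```

Only the first element is inspected, so the check costs the same regardless of the array length.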

bool FixedLengthArrayNoneEncoder::is_null_ignore_not_null ( int8_t *  array)
inline private

Definition at line 317 of file FixedLengthArrayNoneEncoder.h.

References Encoder::buffer_, Data_Namespace::AbstractBuffer::getSqlType(), and is_null_ignore_not_null().

317  {
318  return is_null_ignore_not_null(buffer_->getSqlType(), array);
319  }

void FixedLengthArrayNoneEncoder::readMetadata ( FILE *  f)
inlineoverridevirtual

Implements Encoder.

Definition at line 208 of file FixedLengthArrayNoneEncoder.h.

References elem_max, elem_min, has_nulls, initialized, and Encoder::num_elems_.

208  {
209  // assumes pointer is already in right place
210  fread((int8_t*)&num_elems_, sizeof(size_t), 1, f);
211  fread((int8_t*)&elem_min, sizeof(Datum), 1, f);
212  fread((int8_t*)&elem_max, sizeof(Datum), 1, f);
213  fread((int8_t*)&has_nulls, sizeof(bool), 1, f);
214  fread((int8_t*)&initialized, sizeof(bool), 1, f);
215  }

void FixedLengthArrayNoneEncoder::reduceStats ( const Encoder &  )
inline override virtual

Implements Encoder.

Definition at line 179 of file FixedLengthArrayNoneEncoder.h.

References CHECK.

179 { CHECK(false); }

bool FixedLengthArrayNoneEncoder::resetChunkStats ( const ChunkStats &  stats)
inline override virtual

Reset chunk-level stats (min, max, nulls) using new values from the argument.

Returns
True if an update occurred and the chunk needs to be flushed, false otherwise. Defaults to false if metadata update is unsupported. Chunk stats are only reset if the incoming stats differ from the current stats.

Reimplemented from Encoder.

Definition at line 288 of file FixedLengthArrayNoneEncoder.h.

References Encoder::buffer_, DatumEqual(), elem_max, elem_min, SQLTypeInfo::get_elem_type(), Data_Namespace::AbstractBuffer::getSqlType(), ChunkStats::has_nulls, has_nulls, initialized, ChunkStats::max, and ChunkStats::min.

288  {
289  auto elem_type = buffer_->getSqlType().get_elem_type();
290  if (initialized && DatumEqual(elem_min, stats.min, elem_type) &&
291  DatumEqual(elem_max, stats.max, elem_type) && has_nulls == stats.has_nulls) {
292  return false;
293  }
294  elem_min = stats.min;
295  elem_max = stats.max;
296  has_nulls = stats.has_nulls;
297  return true;
298  }
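
The compare-before-assign pattern in this listing can be sketched with plain integers. `SimpleStats` and `SimpleStatsHolder` are hypothetical simplifications: the real code compares `Datum` values through `DatumEqual` using the element type, which is omitted here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical simplified stats; the real ChunkStats holds Datum min/max values.
struct SimpleStats {
  std::int64_t min;
  std::int64_t max;
  bool has_nulls;
};

struct SimpleStatsHolder {
  SimpleStats current{0, 0, false};
  bool initialized = false;

  // Returns true when the stored stats were overwritten and a flush is needed.
  bool reset(const SimpleStats& incoming) {
    if (initialized && current.min == incoming.min &&
        current.max == incoming.max && current.has_nulls == incoming.has_nulls) {
      return false;  // nothing changed, so no flush is required
    }
    current = incoming;
    return true;
  }
};
```

As in the listing, the sketch does not set `initialized` during a reset, so a holder that was never initialized reports a change on every call.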

void FixedLengthArrayNoneEncoder::resetChunkStats ( )
inline override virtual

Resets chunk metadata stats to their default values.

Implements Encoder.

Definition at line 300 of file FixedLengthArrayNoneEncoder.h.

References has_nulls, and initialized.

void FixedLengthArrayNoneEncoder::update_elem_stats ( const ArrayDatum array)
inlineprivate

Definition at line 321 of file FixedLengthArrayNoneEncoder.h.

References Datum::bigintval, Datum::boolval, Encoder::buffer_, CHECK_EQ, Encoder::decimal_overflow_validator_, Datum::doubleval, elem_max, elem_min, Datum::floatval, SQLTypeInfo::get_compression(), SQLTypeInfo::get_subtype(), Data_Namespace::AbstractBuffer::getSqlType(), has_nulls, initialized, Datum::intval, kBIGINT, kBOOLEAN, kCHAR, kDATE, kDECIMAL, kDOUBLE, kENCODING_DICT, kFLOAT, kINT, kNUMERIC, kSMALLINT, kTEXT, kTIME, kTIMESTAMP, kTINYINT, kVARCHAR, NULL_BIGINT, NULL_BOOLEAN, NULL_DOUBLE, NULL_FLOAT, NULL_INT, NULL_SMALLINT, NULL_TINYINT, Datum::smallintval, Datum::tinyintval, UNREACHABLE, and DecimalOverflowValidator::validate().

Referenced by updateMetadata(), and updateStats().

321  {
322  if (array.is_null) {
323  has_nulls = true;
324  }
325  switch (buffer_->getSqlType().get_subtype()) {
326  case kBOOLEAN: {
327  if (!initialized) {
328  elem_min.boolval = true;
329  elem_max.boolval = false;
330  }
331  if (array.is_null) {
332  break;
333  }
334  const int8_t* bool_array = array.pointer;
335  for (size_t i = 0; i < array.length / sizeof(bool); i++) {
336  if (bool_array[i] == NULL_BOOLEAN) {
337  has_nulls = true;
338  } else if (initialized) {
339  elem_min.boolval = std::min(elem_min.boolval, bool_array[i]);
340  elem_max.boolval = std::max(elem_max.boolval, bool_array[i]);
341  } else {
342  elem_min.boolval = bool_array[i];
343  elem_max.boolval = bool_array[i];
344  initialized = true;
345  }
346  }
347  break;
348  }
349  case kINT: {
350  if (!initialized) {
351  elem_min.intval = 1;
352  elem_max.intval = 0;
353  }
354  if (array.is_null) {
355  break;
356  }
357  const int32_t* int_array = (int32_t*)array.pointer;
358  for (size_t i = 0; i < array.length / sizeof(int32_t); i++) {
359  if (int_array[i] == NULL_INT) {
360  has_nulls = true;
361  } else if (initialized) {
362  elem_min.intval = std::min(elem_min.intval, int_array[i]);
363  elem_max.intval = std::max(elem_max.intval, int_array[i]);
364  } else {
365  elem_min.intval = int_array[i];
366  elem_max.intval = int_array[i];
367  initialized = true;
368  }
369  }
370  break;
371  }
372  case kSMALLINT: {
373  if (!initialized) {
374  elem_min.smallintval = 1;
375  elem_max.smallintval = 0;
376  }
377  if (array.is_null) {
378  break;
379  }
380  const int16_t* smallint_array = (int16_t*)array.pointer;
381  for (size_t i = 0; i < array.length / sizeof(int16_t); i++) {
382  if (smallint_array[i] == NULL_SMALLINT) {
383  has_nulls = true;
384  } else if (initialized) {
385  elem_min.smallintval = std::min(elem_min.smallintval, smallint_array[i]);
386  elem_max.smallintval = std::max(elem_max.smallintval, smallint_array[i]);
387  } else {
388  elem_min.smallintval = smallint_array[i];
389  elem_max.smallintval = smallint_array[i];
390  initialized = true;
391  }
392  }
393  break;
394  }
395  case kTINYINT: {
396  if (!initialized) {
397  elem_min.tinyintval = 1;
398  elem_max.tinyintval = 0;
399  }
400  if (array.is_null) {
401  break;
402  }
403  const int8_t* tinyint_array = (int8_t*)array.pointer;
404  for (size_t i = 0; i < array.length / sizeof(int8_t); i++) {
405  if (tinyint_array[i] == NULL_TINYINT) {
406  has_nulls = true;
407  } else if (initialized) {
408  elem_min.tinyintval = std::min(elem_min.tinyintval, tinyint_array[i]);
409  elem_max.tinyintval = std::max(elem_max.tinyintval, tinyint_array[i]);
410  } else {
411  elem_min.tinyintval = tinyint_array[i];
412  elem_max.tinyintval = tinyint_array[i];
413  initialized = true;
414  }
415  }
416  break;
417  }
418  case kBIGINT:
419  case kNUMERIC:
420  case kDECIMAL: {
421  if (!initialized) {
422  elem_min.bigintval = 1;
423  elem_max.bigintval = 0;
424  }
425  if (array.is_null) {
426  break;
427  }
428  const int64_t* bigint_array = (int64_t*)array.pointer;
429  for (size_t i = 0; i < array.length / sizeof(int64_t); i++) {
430  if (bigint_array[i] == NULL_BIGINT) {
431  has_nulls = true;
432  } else if (initialized) {
433  decimal_overflow_validator_.validate(bigint_array[i]);
434  elem_min.bigintval = std::min(elem_min.bigintval, bigint_array[i]);
435  elem_max.bigintval = std::max(elem_max.bigintval, bigint_array[i]);
436  } else {
437  decimal_overflow_validator_.validate(bigint_array[i]);
438  elem_min.bigintval = bigint_array[i];
439  elem_max.bigintval = bigint_array[i];
440  initialized = true;
441  }
442  }
443  break;
444  }
445  case kFLOAT: {
446  if (!initialized) {
447  elem_min.floatval = 1.0;
448  elem_max.floatval = 0.0;
449  }
450  if (array.is_null) {
451  break;
452  }
453  const float* flt_array = (float*)array.pointer;
454  for (size_t i = 0; i < array.length / sizeof(float); i++) {
455  if (flt_array[i] == NULL_FLOAT) {
456  has_nulls = true;
457  } else if (initialized) {
458  elem_min.floatval = std::min(elem_min.floatval, flt_array[i]);
459  elem_max.floatval = std::max(elem_max.floatval, flt_array[i]);
460  } else {
461  elem_min.floatval = flt_array[i];
462  elem_max.floatval = flt_array[i];
463  initialized = true;
464  }
465  }
466  break;
467  }
468  case kDOUBLE: {
469  if (!initialized) {
470  elem_min.doubleval = 1.0;
471  elem_max.doubleval = 0.0;
472  }
473  if (array.is_null) {
474  break;
475  }
476  const double* dbl_array = (double*)array.pointer;
477  for (size_t i = 0; i < array.length / sizeof(double); i++) {
478  if (dbl_array[i] == NULL_DOUBLE) {
479  has_nulls = true;
480  } else if (initialized) {
481  elem_min.doubleval = std::min(elem_min.doubleval, dbl_array[i]);
482  elem_max.doubleval = std::max(elem_max.doubleval, dbl_array[i]);
483  } else {
484  elem_min.doubleval = dbl_array[i];
485  elem_max.doubleval = dbl_array[i];
486  initialized = true;
487  }
488  }
489  break;
490  }
491  case kTIME:
492  case kTIMESTAMP:
493  case kDATE: {
494  if (!initialized) {
495  elem_min.bigintval = 1;
496  elem_max.bigintval = 0;
497  }
498  if (array.is_null) {
499  break;
500  }
501  const int64_t* tm_array = reinterpret_cast<int64_t*>(array.pointer);
502  for (size_t i = 0; i < array.length / sizeof(int64_t); i++) {
503  if (tm_array[i] == NULL_BIGINT) {
504  has_nulls = true;
505  } else if (initialized) {
506  elem_min.bigintval = std::min(elem_min.bigintval, tm_array[i]);
507  elem_max.bigintval = std::max(elem_max.bigintval, tm_array[i]);
508  } else {
509  elem_min.bigintval = tm_array[i];
510  elem_max.bigintval = tm_array[i];
511  initialized = true;
512  }
513  }
514  break;
515  }
516  case kCHAR:
517  case kVARCHAR:
518  case kTEXT: {
520  if (!initialized) {
521  elem_min.intval = 1;
522  elem_max.intval = 0;
523  }
524  if (array.is_null) {
525  break;
526  }
527  const int32_t* int_array = (int32_t*)array.pointer;
528  for (size_t i = 0; i < array.length / sizeof(int32_t); i++) {
529  if (int_array[i] == NULL_INT) {
530  has_nulls = true;
531  } else if (initialized) {
532  elem_min.intval = std::min(elem_min.intval, int_array[i]);
533  elem_max.intval = std::max(elem_max.intval, int_array[i]);
534  } else {
535  elem_min.intval = int_array[i];
536  elem_max.intval = int_array[i];
537  initialized = true;
538  }
539  }
540  break;
541  }
542  default:
543  UNREACHABLE();
544  }
545  };
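
Every case in the switch above follows the same pattern: skip NULL sentinels while recording that they occurred, seed min and max from the first real value, then widen the range. The sketch below shows that pattern for 32-bit integers. `IntElemStats` and `update_int_stats` are hypothetical names, and the sentinel is passed in as a parameter rather than taken from the OmniSciDB `NULL_INT` macro.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical per-chunk element statistics for an int32 array column.
struct IntElemStats {
  std::int32_t min = 1;  // min > max marks "no values seen yet"
  std::int32_t max = 0;
  bool has_nulls = false;
  bool initialized = false;
};

// Folds count values into stats, treating null_sentinel as a NULL element.
void update_int_stats(IntElemStats& stats,
                      const std::int32_t* values,
                      std::size_t count,
                      std::int32_t null_sentinel) {
  assert(values != nullptr || count == 0);
  for (std::size_t i = 0; i < count; ++i) {
    const std::int32_t v = values[i];
    if (v == null_sentinel) {
      stats.has_nulls = true;
    } else if (stats.initialized) {
      stats.min = std::min(stats.min, v);
      stats.max = std::max(stats.max, v);
    } else {
      // The first non-null value seeds both bounds.
      stats.min = v;
      stats.max = v;
      stats.initialized = true;
    }
  }
}
```

Starting with min greater than max gives an empty range until a real value arrives, so a chunk containing only NULL elements never reports a misleading bound.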

void FixedLengthArrayNoneEncoder::updateMetadata ( int8_t *  array)
inline

Definition at line 227 of file FixedLengthArrayNoneEncoder.h.

References array_size, is_null(), and update_elem_stats().

227  {
229  }

void FixedLengthArrayNoneEncoder::updateStats ( const int64_t  ,
const bool   
)
inline override virtual

Implements Encoder.

Definition at line 175 of file FixedLengthArrayNoneEncoder.h.

References CHECK.

Referenced by appendData().

175 { CHECK(false); }

void FixedLengthArrayNoneEncoder::updateStats ( const double  ,
const bool   
)
inline override virtual

Implements Encoder.

Definition at line 177 of file FixedLengthArrayNoneEncoder.h.

References CHECK.

177 { CHECK(false); }

void FixedLengthArrayNoneEncoder::updateStats ( const int8_t *const  src_data,
const size_t  num_elements 
)
inline override virtual

Update statistics for data without appending.

Parameters
src_data- the data with which to update statistics
num_elements- the number of elements to scan in the data

Implements Encoder.

Definition at line 181 of file FixedLengthArrayNoneEncoder.h.

References UNREACHABLE.

181  {
182  UNREACHABLE();
183  }

void FixedLengthArrayNoneEncoder::updateStats ( const std::vector< std::string > *const  src_data,
const size_t  start_idx,
const size_t  num_elements 
)
inline override virtual

Update statistics for string data without appending.

Parameters
src_data- the string data with which to update statistics
start_idx- the offset into src_data to start the update
num_elements- the number of elements to scan in the string data

Implements Encoder.

Definition at line 185 of file FixedLengthArrayNoneEncoder.h.

References UNREACHABLE.

187  {
188  UNREACHABLE();
189  }

void FixedLengthArrayNoneEncoder::updateStats ( const std::vector< ArrayDatum > *const  src_data,
const size_t  start_idx,
const size_t  num_elements 
)
inline override virtual

Update statistics for array data without appending.

Parameters
src_data- the array data with which to update statistics
start_idx- the offset into src_data to start the update
num_elements- the number of elements to scan in the array data

Implements Encoder.

Definition at line 191 of file FixedLengthArrayNoneEncoder.h.

References anonymous_namespace{Utm.h}::n, and update_elem_stats().

193  {
194  for (size_t n = start_idx; n < start_idx + num_elements; n++) {
195  update_elem_stats((*src_data)[n]);
196  }
197  }

void FixedLengthArrayNoneEncoder::writeMetadata ( FILE *  f)
inline override virtual

Implements Encoder.

Definition at line 199 of file FixedLengthArrayNoneEncoder.h.

References elem_max, elem_min, has_nulls, initialized, and Encoder::num_elems_.

199  {
200  // assumes pointer is already in right place
201  fwrite((int8_t*)&num_elems_, sizeof(size_t), 1, f);
202  fwrite((int8_t*)&elem_min, sizeof(Datum), 1, f);
203  fwrite((int8_t*)&elem_max, sizeof(Datum), 1, f);
204  fwrite((int8_t*)&has_nulls, sizeof(bool), 1, f);
205  fwrite((int8_t*)&initialized, sizeof(bool), 1, f);
206  }
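
`writeMetadata` and `readMetadata` must agree on field order and field size, because the fields are stored back to back with no tags. The round trip can be sketched as follows. `ExampleMetadata` is a hypothetical stand-in that uses `int64_t` in place of `Datum`, and, unlike the listings on this page, the sketch checks the return value of each `fwrite` and `fread` call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical stand-in for the encoder's persisted fields.
struct ExampleMetadata {
  std::size_t num_elems;
  std::int64_t elem_min;
  std::int64_t elem_max;
  bool has_nulls;
  bool initialized;
};

// Writes each field in a fixed order; the file position must already be set.
bool write_metadata(std::FILE* f, const ExampleMetadata& m) {
  return std::fwrite(&m.num_elems, sizeof(m.num_elems), 1, f) == 1 &&
         std::fwrite(&m.elem_min, sizeof(m.elem_min), 1, f) == 1 &&
         std::fwrite(&m.elem_max, sizeof(m.elem_max), 1, f) == 1 &&
         std::fwrite(&m.has_nulls, sizeof(m.has_nulls), 1, f) == 1 &&
         std::fwrite(&m.initialized, sizeof(m.initialized), 1, f) == 1;
}

// Reads the fields back in exactly the order they were written.
bool read_metadata(std::FILE* f, ExampleMetadata& m) {
  return std::fread(&m.num_elems, sizeof(m.num_elems), 1, f) == 1 &&
         std::fread(&m.elem_min, sizeof(m.elem_min), 1, f) == 1 &&
         std::fread(&m.elem_max, sizeof(m.elem_max), 1, f) == 1 &&
         std::fread(&m.has_nulls, sizeof(m.has_nulls), 1, f) == 1 &&
         std::fread(&m.initialized, sizeof(m.initialized), 1, f) == 1;
}
```

A layout like this is only portable between builds that share the same `size_t` width and byte order, since the raw in-memory representation is what gets written.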

Member Data Documentation

Datum FixedLengthArrayNoneEncoder::elem_max
Datum FixedLengthArrayNoneEncoder::elem_min
std::mutex FixedLengthArrayNoneEncoder::EncoderMutex_
private

Definition at line 311 of file FixedLengthArrayNoneEncoder.h.

bool FixedLengthArrayNoneEncoder::initialized
std::mutex FixedLengthArrayNoneEncoder::print_mutex_
private

Definition at line 312 of file FixedLengthArrayNoneEncoder.h.


The documentation for this class was generated from the following file: