OmniSciDB  a5dc49c757
File_Namespace Namespace Reference

Namespaces

 anonymous_namespace{FileBuffer.cpp}
 
 anonymous_namespace{FileMgr.cpp}
 

Classes

struct  DiskCacheConfig
 
class  TableFileMgr
 
class  CachingFileBuffer
 
class  CachingFileMgr
 A FileMgr capable of limiting its size and storing data from multiple tables in a shared directory. For any table that supports DiskCaching, the CachingFileMgr must contain either metadata for all table chunks, or for none (the cache either has no knowledge of that table, or has complete knowledge of that table). Any data chunk within a table may or may not be contained within the cache. More...
 
class  CachingGlobalFileMgr
 
struct  readThreadDS
 
class  FileBuffer
 Represents/provides access to contiguous data stored in the file system. More...
 
struct  FileInfo
 
struct  FileMetadata
 
struct  StorageStats
 
struct  OpenFilesResult
 
struct  PageMapping
 
class  FileMgr
 
struct  FileMgrParams
 
class  GlobalFileMgr
 
struct  Page
 A logical page (Page) belongs to a file on disk. More...
 
struct  EpochedPage
 
struct  MultiPage
 The MultiPage stores versions of the same logical page in a deque. More...
 
struct  HeaderInfo
 Stores a ChunkKey together with a page id and version, paired with the Page struct itself (file id and page number). More...
 

Typedefs

using PageSizeFileMMap = std::multimap< size_t, int32_t >
 Maps logical page sizes to files. More...
 
using Chunk = FileBuffer
 A Chunk is the fundamental unit of execution in Map-D. More...
 
using ChunkKeyToChunkMap = std::map< ChunkKey, FileBuffer * >
 Maps ChunkKeys (unique ids for Chunks) to Chunk objects. More...
 
using TablePair = std::pair< const int32_t, const int32_t >
 
using PageHeaderSizeType = int32_t
 

Enumerations

enum  DiskCacheLevel { DiskCacheLevel::none, DiskCacheLevel::fsi, DiskCacheLevel::non_fsi, DiskCacheLevel::all }
 

Functions

std::ostream & operator<< (std::ostream &os, DiskCacheLevel disk_cache_level)
 
std::string get_dir_name_for_table (int db_id, int tb_id)
 
static size_t readForThread (FileBuffer *fileBuffer, const readThreadDS threadDS)
 
bool is_page_deleted_with_checkpoint (int32_t table_epoch, int32_t page_epoch, int32_t contingent)
 
bool is_page_deleted_without_checkpoint (int32_t table_epoch, int32_t page_epoch, int32_t contingent)
 
std::string get_data_file_path (const std::string &base_path, int file_id, size_t page_size)
 
std::string get_legacy_data_file_path (const std::string &new_data_file_path)
 
std::pair< FILE *, std::string > create (const std::string &basePath, const int fileId, const size_t pageSize, const size_t numPages)
 
FILE * create (const std::string &full_path, const size_t requested_file_size)
 Opens/creates a file with the given path and file size. Fatal crash if the file cannot be created with the required size. More...
 
FILE * open (int file_id)
 Opens the file with the given id; fatal crash on error. More...
 
FILE * open (const std::string &path)
 
void close (FILE *f)
 Closes the file pointed to by the FILE pointer. More...
 
bool removeFile (const std::string &basePath, const std::string &filename)
 Deletes the file named filename under the directory basePath. More...
 
size_t read (FILE *f, const size_t offset, const size_t size, int8_t *buf, const std::string &file_path)
 Reads the specified number of bytes from the offset position in file f into buf. More...
 
size_t write (FILE *f, const size_t offset, const size_t size, const int8_t *buf)
 Writes the specified number of bytes to the offset position in file f from buf. More...
 
size_t append (FILE *f, const size_t size, const int8_t *buf)
 Appends the specified number of bytes to the end of the file f from buf. More...
 
size_t readPage (FILE *f, const size_t pageSize, const size_t pageNum, int8_t *buf, const std::string &file_path)
 Reads the specified page from the file f into buf. More...
 
size_t readPartialPage (FILE *f, const size_t pageSize, const size_t offset, const size_t readSize, const size_t pageNum, int8_t *buf, const std::string &file_path)
 
size_t writePage (FILE *f, const size_t pageSize, const size_t pageNum, int8_t *buf)
 Writes a page from buf to the file. More...
 
size_t writePartialPage (FILE *f, const size_t pageSize, const size_t offset, const size_t writeSize, const size_t pageNum, int8_t *buf)
 
size_t appendPage (FILE *f, const size_t pageSize, int8_t *buf)
 Appends a page from buf to the file. More...
 
size_t fileSize (FILE *f)
 Returns the size of the specified file. More...
 
void renameForDelete (const std::string directoryName)
 Renames a directory to <oldname>_<EPOCH>_DELETE_ME, where <EPOCH> is the current time in milliseconds since the system clock's epoch. More...
 

Variables

constexpr int32_t DELETE_CONTINGENT = -1
 A FileInfo type has a file pointer and metadata about a file. More...
 
constexpr int32_t ROLLOFF_CONTINGENT = -2
 
constexpr int32_t kDbVersion {2}
 DB version for DataMgr DS and corresponding file buffer read/write code. More...
 
constexpr auto kLegacyDataFileExtension {".mapd"}
 

Typedef Documentation

using File_Namespace::Chunk = typedef FileBuffer

A Chunk is the fundamental unit of execution in Map-D.

Chunk A chunk is composed of logical pages. These pages can exist across multiple files managed by the file manager.

The collection of pages is implemented as a FileBuffer object, which is composed of a vector of MultiPage objects, one for each logical page of the file buffer.

Definition at line 80 of file FileMgr.h.

using File_Namespace::ChunkKeyToChunkMap = typedef std::map<ChunkKey, FileBuffer*>

Maps ChunkKeys (unique ids for Chunks) to Chunk objects.

ChunkKeyToChunkMap The file system can store multiple chunks across multiple files. With that in mind, the challenge is to be able to reconstruct the pages that compose a chunk upon request. A chunk key (ChunkKey) uniquely identifies a chunk, and so ChunkKeyToChunkMap maps chunk keys to Chunk types, which are vectors of MultiPage* pointers (logical pages).

Definition at line 92 of file FileMgr.h.

using File_Namespace::PageHeaderSizeType = typedef int32_t

Definition at line 134 of file FileMgr.h.

using File_Namespace::PageSizeFileMMap = typedef std::multimap<size_t, int32_t>

Maps logical page sizes to files.

PageSizeFileMMap The file manager uses this type in order to quickly find files of a certain page size. A multimap is used to associate the key (page size) with values (file identifiers of files having the matching page size).

Definition at line 68 of file FileMgr.h.
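
As an illustration of the lookup described above, the following minimal sketch collects the ids of all files that share a page size. The helper name files_with_page_size is hypothetical and not part of File_Namespace; only the alias is taken from this page.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <vector>

// Same shape as the alias documented above.
using PageSizeFileMMap = std::multimap<std::size_t, std::int32_t>;

// Hypothetical helper: return the ids of every file whose page size matches.
std::vector<std::int32_t> files_with_page_size(const PageSizeFileMMap& files_by_page_size,
                                               std::size_t page_size) {
  std::vector<std::int32_t> file_ids;
  // equal_range yields the half-open range of entries stored under this key.
  const auto range = files_by_page_size.equal_range(page_size);
  for (auto it = range.first; it != range.second; ++it) {
    file_ids.push_back(it->second);
  }
  return file_ids;
}
```

A multimap fits here because several files can share one page size, and equal_range returns all of them in a single logarithmic lookup.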

using File_Namespace::TablePair = typedef std::pair<const int32_t, const int32_t>

TablePair Pair detailing the id for a database and table (first two entries in a ChunkKey).

Definition at line 98 of file FileMgr.h.

Enumeration Type Documentation

Function Documentation

size_t File_Namespace::append ( FILE *  f,
const size_t  size,
const int8_t *  buf 
)

Appends the specified number of bytes to the end of the file f from buf.

Parameters
f  Pointer to the FILE.
size  The number of bytes to append to the file.
buf  The source buffer containing the data to be appended.
Returns
size_t The number of bytes written.

Definition at line 158 of file File.cpp.

References fileSize(), and write().

Referenced by benchmarks.GoogleBenchmark::aggregateBenchmarks(), run_benchmark::benchmark(), concat(), concat_with(), com.mapd.utility.SQLImporter::createDBTable(), org.apache.calcite.rel.externalize.HeavyDBRelWriterImpl::explain_(), com.mapd.calcite.parser.HeavyDBSqlOperatorTable.ExtTableFunction::getExtendedSignature(), ai.heavy.jdbc.HeavyAIPreparedStatement::getQuery(), com.mapd.parser.server.ExtensionFunctionSignatureParser::join(), run_benchmark::read_query_files(), run_benchmark_arrow::run_query(), run_benchmark::run_query(), org.apache.calcite.rel.externalize.HeavyDBRelWriterImpl::simple(), ai.heavy.jdbc.HeavyAIEscapeFunctions::singleArgumentFunctionCall(), com.mapd.parser.server.ExtensionFunction::toJson(), ai.heavy.jdbc.HeavyAIArray::toString(), and TableFunctionsFactory_transformers.AmbiguousSignatureCheckTransformer::visit_udtf_node().

158  {
159  return write(f, fileSize(f), size, buf);
160 }

size_t File_Namespace::appendPage ( FILE *  f,
const size_t  pageSize,
int8_t *  buf 
)

Appends a page from buf to the file.

Parameters
f  Pointer to the FILE.
pageSize  The logical page size of the file.
buf  The source buffer from where data is being read.
Returns
size_t The number of bytes appended (should be equal to pageSize).

Definition at line 193 of file File.cpp.

References fileSize(), and write().

193  {
194  return write(f, fileSize(f), pageSize, buf);
195 }

void File_Namespace::close ( FILE *  f)

Closes the file pointed to by the FILE pointer.

Parameters
f  Pointer to the FILE.

Definition at line 114 of file File.cpp.

References CHECK, and CHECK_EQ.

Referenced by File_Namespace::FileMgr::closePhysicalUnlocked(), File_Namespace::FileMgr::openAndReadLegacyEpochFile(), File_Namespace::FileMgr::readVersionFromDisk(), File_Namespace::FileMgr::writeAndSyncVersionToDisk(), File_Namespace::FileInfo::~FileInfo(), File_Namespace::FileMgr::~FileMgr(), and File_Namespace::TableFileMgr::~TableFileMgr().

114  {
115  CHECK(f);
116  CHECK_EQ(fflush(f), 0);
117  CHECK_EQ(fclose(f), 0);
118 }

std::pair< FILE *, std::string > File_Namespace::create ( const std::string &  basePath,
const int  fileId,
const size_t  pageSize,
const size_t  numPages 
)

Definition at line 55 of file File.cpp.

References f(), logger::FATAL, nvtx_helpers::anonymous_namespace{nvtx_helpers.cpp}::filename(), fileSize(), heavyai::fopen(), get_data_file_path(), get_legacy_data_file_path(), and LOG.

Referenced by File_Namespace::FileMgr::createEpochFile(), File_Namespace::FileMgr::createFile(), File_Namespace::FileMgr::createFileInfo(), UdfCompiler::generateAST(), kafka_insert(), org.apache.calcite.sql2rel.SqlToRelConverter::SqlToRelConverter(), File_Namespace::TableFileMgr::TableFileMgr(), and File_Namespace::FileMgr::writeAndSyncVersionToDisk().

58  {
59  auto path = get_data_file_path(basePath, fileId, pageSize);
60  if (numPages < 1 || pageSize < 1) {
61  LOG(FATAL) << "Error trying to create file '" << path
62  << "', Number of pages and page size must be positive integers. numPages "
63  << numPages << " pageSize " << pageSize;
64  }
65  FILE* f = heavyai::fopen(path.c_str(), "w+b");
66  if (f == nullptr) {
67  LOG(FATAL) << "Error trying to create file '" << path
68  << "', the error was: " << std::strerror(errno);
69  }
70  fseek(f, static_cast<long>((pageSize * numPages) - 1), SEEK_SET);
71  fputc(EOF, f);
72  fseek(f, 0, SEEK_SET); // rewind
73  if (fileSize(f) != pageSize * numPages) {
74  LOG(FATAL) << "Error trying to create file '" << path << "', file size "
75  << fileSize(f) << " does not equal pageSize * numPages "
76  << pageSize * numPages;
77  }
78  boost::filesystem::create_symlink(boost::filesystem::canonical(path).filename(),
79  get_legacy_data_file_path(path));
80  return {f, path};
81 }

FILE * File_Namespace::create ( const std::string &  full_path,
const size_t  requested_file_size 
)

Opens/creates a file with the given path and file size. Fatal crash if the file cannot be created with the required size.

Parameters
full_path  Full path to the file.
requested_file_size  The size the file must be created with.
Returns
FILE* A pointer to the opened FILE; never null.

Definition at line 83 of file File.cpp.

References f(), logger::FATAL, fileSize(), heavyai::fopen(), and LOG.

83  {
84  FILE* f = heavyai::fopen(full_path.c_str(), "w+b");
85  if (f == nullptr) {
86  LOG(FATAL) << "Error trying to create file '" << full_path
87  << "', the error was: " << std::strerror(errno);
88  }
89  fseek(f, static_cast<long>(requested_file_size - 1), SEEK_SET);
90  fputc(EOF, f);
91  fseek(f, 0, SEEK_SET); // rewind
92  if (fileSize(f) != requested_file_size) {
93  LOG(FATAL) << "Error trying to create file '" << full_path << "', file size "
94  << fileSize(f) << " does not equal requested_file_size "
95  << requested_file_size;
96  }
97  return f;
98 }
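
The sizing step in both create() overloads seeks to the last byte of the requested size, writes a single byte, and then verifies the resulting file size. A minimal sketch of that technique follows. The name preallocate is hypothetical, the function reports failure through its return value instead of LOG(FATAL), and it writes a zero byte where the listing writes EOF; these are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical helper: grow an open file to exactly `size` bytes by writing
// one byte at offset size - 1, then confirm the size via seek/tell.
bool preallocate(std::FILE* f, std::size_t size) {
  if (size == 0) {
    return false;  // size - 1 would wrap around for an unsigned type
  }
  if (std::fseek(f, static_cast<long>(size - 1), SEEK_SET) != 0) {
    return false;
  }
  if (std::fputc(0, f) == EOF) {
    return false;
  }
  if (std::fseek(f, 0, SEEK_END) != 0) {
    return false;
  }
  const long end = std::ftell(f);
  std::rewind(f);  // leave the position at the start, as create() does
  return end >= 0 && static_cast<std::size_t>(end) == size;
}
```

The explicit zero-size guard is a design choice of this sketch: subtracting one from an unsigned zero wraps to the maximum value, so the seek target would be meaningless.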

size_t File_Namespace::fileSize ( FILE *  f)

Returns the size of the specified file.

Todo:
There may be an issue casting to size_t from long.
Parameters
f  A pointer to the file.
Returns
size_t The number of bytes of the file.

Definition at line 198 of file File.cpp.

Referenced by append(), appendPage(), and create().

198  {
199  fseek(f, 0, SEEK_END);
200  size_t size = (size_t)ftell(f);
201  fseek(f, 0, SEEK_SET);
202  return size;
203 }
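
The Todo above concerns the cast from long to size_t: ftell() returns -1L on failure, which the cast would turn into a very large size. The following is a minimal sketch of a checked variant, not the library's implementation; checked_file_size is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical variant of fileSize(): reports failure instead of casting a
// negative ftell() result to size_t.
bool checked_file_size(std::FILE* f, std::size_t& size_out) {
  if (std::fseek(f, 0, SEEK_END) != 0) {
    return false;
  }
  const long end = std::ftell(f);  // -1L signals an error
  if (std::fseek(f, 0, SEEK_SET) != 0) {  // rewind, as fileSize() does
    return false;
  }
  if (end < 0) {
    return false;
  }
  size_out = static_cast<std::size_t>(end);
  return true;
}
```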

std::string File_Namespace::get_data_file_path ( const std::string &  base_path,
int  file_id,
size_t  page_size 
)

Definition at line 42 of file File.cpp.

References DATA_FILE_EXT, and to_string().

Referenced by create(), File_Namespace::FileMgr::createFile(), File_Namespace::FileMgr::createFileInfo(), and File_Namespace::FileMgr::deleteEmptyFiles().

44  {
45  return base_path + "/" + std::to_string(file_id) + "." + std::to_string(page_size) +
46  std::string(DATA_FILE_EXT); // DATA_FILE_EXT has preceding "."
47 }
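
The naming scheme in the listing above produces paths of the form base_path/file_id.page_size followed by the extension. A minimal sketch follows; the value of DATA_FILE_EXT is not shown on this page, so the constant kAssumedDataFileExt (".data") and the function name data_file_path_sketch are assumptions for illustration only.

```cpp
#include <cstddef>
#include <string>

// Assumption: stands in for DATA_FILE_EXT, whose value this page does not show.
constexpr const char* kAssumedDataFileExt = ".data";

// Hypothetical re-statement of the path layout used by get_data_file_path().
std::string data_file_path_sketch(const std::string& base_path,
                                  int file_id,
                                  std::size_t page_size) {
  return base_path + "/" + std::to_string(file_id) + "." +
         std::to_string(page_size) + kAssumedDataFileExt;
}
```

Under that assumption, file 3 with 4096-byte pages in /var/data maps to /var/data/3.4096.data.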

std::string File_Namespace::get_dir_name_for_table ( int  db_id,
int  tb_id 
)
inline

Definition at line 90 of file CachingFileMgr.h.

Referenced by File_Namespace::CachingFileMgr::getTableFileMgrPath().

90  {
91  std::stringstream file_name;
92  file_name << "table_" << db_id << "_" << tb_id << "/";
93  return file_name.str();
94 }


std::string File_Namespace::get_legacy_data_file_path ( const std::string &  new_data_file_path)

Definition at line 49 of file File.cpp.

References kLegacyDataFileExtension.

Referenced by create(), and File_Namespace::FileMgr::deleteEmptyFiles().

49  {
50  auto legacy_path = boost::filesystem::canonical(new_data_file_path);
51  legacy_path.replace_extension(kLegacyDataFileExtension);
52  return legacy_path.string();
53 }

bool File_Namespace::is_page_deleted_with_checkpoint ( int32_t  table_epoch,
int32_t  page_epoch,
int32_t  contingent 
)

Definition at line 259 of file FileInfo.cpp.

References DELETE_CONTINGENT, and ROLLOFF_CONTINGENT.

Referenced by anonymous_namespace{TableArchiver.cpp}::update_or_drop_column_ids_in_page_headers(), and File_Namespace::FileMgr::updatePageIfDeleted().

261  {
262  const bool delete_contingent =
263  (contingent == DELETE_CONTINGENT || contingent == ROLLOFF_CONTINGENT);
264  // Check if page was deleted with a checkpointed epoch
265  if (delete_contingent && (table_epoch >= page_epoch)) {
266  return true;
267  }
268  return false;
269 }

bool File_Namespace::is_page_deleted_without_checkpoint ( int32_t  table_epoch,
int32_t  page_epoch,
int32_t  contingent 
)

Definition at line 271 of file FileInfo.cpp.

References DELETE_CONTINGENT, and ROLLOFF_CONTINGENT.

Referenced by File_Namespace::FileMgr::updatePageIfDeleted().

273  {
274  const bool delete_contingent =
275  (contingent == DELETE_CONTINGENT || contingent == ROLLOFF_CONTINGENT);
276  // Check if page was deleted but the epoch was not yet checkpointed.
277  if (delete_contingent && (table_epoch < page_epoch)) {
278  return true;
279  }
280  return false;
281 }
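
The two predicates above split pages that carry a delete or rolloff contingent by whether the table epoch has reached the page epoch. A minimal sketch of that decision as a single classification follows. PageState and classify_page are hypothetical names, and the constants are redeclared locally only so the block is self-contained.

```cpp
#include <cstdint>

// Redeclared here for a self-contained example; values match this page.
constexpr std::int32_t DELETE_CONTINGENT = -1;
constexpr std::int32_t ROLLOFF_CONTINGENT = -2;

// Hypothetical three-way view of the two boolean predicates.
enum class PageState { Live, DeletedCheckpointed, DeletedPending };

PageState classify_page(std::int32_t table_epoch,
                        std::int32_t page_epoch,
                        std::int32_t contingent) {
  const bool marked =
      contingent == DELETE_CONTINGENT || contingent == ROLLOFF_CONTINGENT;
  if (!marked) {
    return PageState::Live;  // neither predicate returns true
  }
  // table_epoch >= page_epoch: the deletion has been checkpointed.
  // table_epoch <  page_epoch: the deletion is not yet checkpointed.
  return table_epoch >= page_epoch ? PageState::DeletedCheckpointed
                                   : PageState::DeletedPending;
}
```

For a marked page exactly one of the two predicates is true, which is why a single enum value can represent the outcome.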

FILE * File_Namespace::open ( int  file_id)

Opens the file with the given id; fatal crash on error.

Parameters
file_id  The id of the file to open.
Returns
FILE* A pointer to the opened FILE; never null.

Definition at line 100 of file File.cpp.

References DATA_FILE_EXT, and to_string().

Referenced by File_Namespace::FileMgr::openAndReadEpochFile(), File_Namespace::FileMgr::openAndReadLegacyEpochFile(), File_Namespace::FileMgr::openExistingFile(), File_Namespace::FileMgr::readVersionFromDisk(), File_Namespace::TableFileMgr::TableFileMgr(), and File_Namespace::FileMgr::writeAndSyncVersionToDisk().

100  {
101  std::string s(std::to_string(file_id) + std::string(DATA_FILE_EXT));
102  return open(s);
103 }

FILE * File_Namespace::open ( const std::string &  path)

Definition at line 105 of file File.cpp.

References f(), logger::FATAL, heavyai::fopen(), and LOG.

105  {
106  FILE* f = heavyai::fopen(path.c_str(), "r+b");
107  if (f == nullptr) {
108  LOG(FATAL) << "Error trying to open file '" << path
109  << "', the errno was: " << std::strerror(errno);
110  }
111  return f;
112 }

std::ostream & File_Namespace::operator<< ( std::ostream &  os,
DiskCacheLevel  disk_cache_level 
)

Definition at line 847 of file CachingFileMgr.cpp.

References all, fsi, non_fsi, none, and UNREACHABLE.

847  {
848  if (disk_cache_level == DiskCacheLevel::none) {
849  os << "None";
850  } else if (disk_cache_level == DiskCacheLevel::fsi) {
851  os << "ForeignTables";
852  } else if (disk_cache_level == DiskCacheLevel::non_fsi) {
853  os << "LocalTables";
854  } else if (disk_cache_level == DiskCacheLevel::all) {
855  os << "All";
856  } else {
857  UNREACHABLE() << "Unexpected disk cache level: "
858  << static_cast<int32_t>(disk_cache_level);
859  }
860  return os;
861 }

size_t File_Namespace::read ( FILE *  f,
const size_t  offset,
const size_t  size,
int8_t *  buf,
const std::string &  file_path 
)

Reads the specified number of bytes from the offset position in file f into buf.

Parameters
f  Pointer to the FILE.
offset  The location within the file from which to read.
size  The number of bytes to be read.
buf  The destination buffer to where data is being read from the file.
file_path  Path of file to read from.
Returns
size_t The number of bytes read.

Definition at line 125 of file File.cpp.

References CHECK_EQ.

Referenced by File_Namespace::FileMgr::openAndReadEpochFile(), File_Namespace::FileMgr::openAndReadLegacyEpochFile(), File_Namespace::FileInfo::read(), readPage(), readPartialPage(), File_Namespace::FileMgr::readVersionFromDisk(), heavyai::safe_read(), and File_Namespace::TableFileMgr::TableFileMgr().

129  {
130  // read "size" bytes from the offset location in the file into the buffer
131  CHECK_EQ(fseek(f, static_cast<long>(offset), SEEK_SET), 0);
132  size_t bytesRead = fread(buf, sizeof(int8_t), size, f);
133  auto expected_bytes_read = sizeof(int8_t) * size;
134  CHECK_EQ(bytesRead, expected_bytes_read)
135  << "Unexpected number of bytes read from file: " << file_path
136  << ". Expected bytes read: " << expected_bytes_read
137  << ", actual bytes read: " << bytesRead << ", offset: " << offset
138  << ", file stream error set: " << (std::ferror(f) ? "true" : "false")
139  << ", EOF reached: " << (std::feof(f) ? "true" : "false");
140  return bytesRead;
141 }

static size_t File_Namespace::readForThread ( FileBuffer *  fileBuffer,
const readThreadDS  threadDS 
)
static

Definition at line 256 of file FileBuffer.cpp.

References CHECK, File_Namespace::FileMgr::getFileInfoForFileId(), File_Namespace::readThreadDS::multiPages, File_Namespace::FileBuffer::pageDataSize(), File_Namespace::FileBuffer::pageSize(), File_Namespace::FileInfo::read(), File_Namespace::FileBuffer::reservedHeaderSize(), File_Namespace::readThreadDS::t_bytesLeft, File_Namespace::readThreadDS::t_curPtr, File_Namespace::readThreadDS::t_endPage, File_Namespace::readThreadDS::t_fm, File_Namespace::readThreadDS::t_isFirstPage, File_Namespace::readThreadDS::t_startPage, and File_Namespace::readThreadDS::t_startPageOffset.

Referenced by File_Namespace::FileBuffer::read().

256  {
257  size_t startPage = threadDS.t_startPage; // start reading at startPage, including it
258  size_t endPage = threadDS.t_endPage; // stop reading at endPage, not including it
259  int8_t* curPtr = threadDS.t_curPtr;
260  size_t bytesLeft = threadDS.t_bytesLeft;
261  size_t totalBytesRead = 0;
262  bool isFirstPage = threadDS.t_isFirstPage;
263 
264  // Traverse the logical pages
265  for (size_t pageNum = startPage; pageNum < endPage; ++pageNum) {
266  CHECK(threadDS.multiPages[pageNum].pageSize == fileBuffer->pageSize());
267  Page page = threadDS.multiPages[pageNum].current().page;
268 
269  FileInfo* fileInfo = threadDS.t_fm->getFileInfoForFileId(page.fileId);
270  CHECK(fileInfo);
271 
272  // Read the page into the destination (dst) buffer at its
273  // current (cur) location
274  size_t bytesRead = 0;
275  if (isFirstPage) {
276  bytesRead = fileInfo->read(
277  page.pageNum * fileBuffer->pageSize() + threadDS.t_startPageOffset +
278  fileBuffer->reservedHeaderSize(),
279  min(fileBuffer->pageDataSize() - threadDS.t_startPageOffset, bytesLeft),
280  curPtr);
281  isFirstPage = false;
282  } else {
283  bytesRead = fileInfo->read(
284  page.pageNum * fileBuffer->pageSize() + fileBuffer->reservedHeaderSize(),
285  min(fileBuffer->pageDataSize(), bytesLeft),
286  curPtr);
287  }
288  curPtr += bytesRead;
289  bytesLeft -= bytesRead;
290  totalBytesRead += bytesRead;
291  }
292  CHECK(bytesLeft == 0);
293 
294  return (totalBytesRead);
295 }
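
The loop above reads at most one page's worth of payload per iteration, and only the first page is shortened by the start offset. A minimal sketch of that size calculation follows, with no file I/O. plan_page_reads is a hypothetical name, and the sketch assumes the first page in the range is the first page of the read (that is, t_isFirstPage is true).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: number of payload bytes taken from each page, given
// the per-page payload size, the offset into the first page, the page count,
// and the total number of bytes requested.
std::vector<std::size_t> plan_page_reads(std::size_t page_data_size,
                                         std::size_t start_page_offset,
                                         std::size_t num_pages,
                                         std::size_t bytes_left) {
  assert(start_page_offset <= page_data_size);
  std::vector<std::size_t> sizes;
  for (std::size_t i = 0; i < num_pages; ++i) {
    // Only the first page skips start_page_offset bytes of payload.
    const std::size_t available =
        (i == 0) ? page_data_size - start_page_offset : page_data_size;
    const std::size_t n = std::min(available, bytes_left);
    sizes.push_back(n);
    bytes_left -= n;
  }
  assert(bytes_left == 0);  // mirrors CHECK(bytesLeft == 0) in the listing
  return sizes;
}
```

With 100 payload bytes per page, a start offset of 30, three pages, and 200 bytes requested, the plan is 70, 100, and 30 bytes.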

size_t File_Namespace::readPage ( FILE *  f,
const size_t  pageSize,
const size_t  pageNum,
int8_t *  buf,
const std::string &  file_path 
)

Reads the specified page from the file f into buf.

Parameters
f  Pointer to the FILE.
pageSize  The logical page size of the file.
pageNum  The page number from where data is being read.
buf  The destination buffer to where data is being written.
file_path  Path of file to read from.
Returns
size_t The number of bytes read (should be equal to pageSize).

Definition at line 162 of file File.cpp.

References read().

166  {
167  return read(f, pageNum * pageSize, pageSize, buf, file_path);
168 }

size_t File_Namespace::readPartialPage ( FILE *  f,
const size_t  pageSize,
const size_t  offset,
const size_t  readSize,
const size_t  pageNum,
int8_t *  buf,
const std::string &  file_path 
)

Definition at line 170 of file File.cpp.

References read().

176  {
177  return read(f, pageNum * pageSize + offset, readSize, buf, file_path);
178 }
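
Both readPage() and readPartialPage() reduce to a byte range handed to read(): a full page starts at pageNum * pageSize, and a partial read adds an offset within that page. A minimal sketch of the arithmetic follows. ByteRange, page_range, and partial_page_range are hypothetical names, and the bounds assertion is an added assumption that the listings above do not enforce.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical value type: where a read starts and how many bytes it covers.
struct ByteRange {
  std::size_t offset;
  std::size_t size;
};

// Byte range of a whole page, as used by readPage().
ByteRange page_range(std::size_t page_size, std::size_t page_num) {
  return {page_num * page_size, page_size};
}

// Byte range of a partial page, as used by readPartialPage().
ByteRange partial_page_range(std::size_t page_size,
                             std::size_t page_num,
                             std::size_t offset,
                             std::size_t read_size) {
  // Assumption of this sketch: the read stays inside one page.
  assert(offset + read_size <= page_size);
  return {page_num * page_size + offset, read_size};
}
```

The same arithmetic applies to writePage() and writePartialPage(), which pass the computed offset to write() instead.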

bool File_Namespace::removeFile ( const std::string &  basePath,
const std::string &  filename 
)

Deletes the file named filename under the directory basePath.

Parameters
basePath  The base path (directory) of the file.
filename  The name of the file to delete; it is appended directly to basePath.
Returns
bool True if the file was removed, false otherwise.

Definition at line 120 of file File.cpp.

References nvtx_helpers::anonymous_namespace{nvtx_helpers.cpp}::filename().

120  {
121  const std::string file_path = base_path + filename;
122  return remove(file_path.c_str()) == 0;
123 }


void File_Namespace::renameForDelete ( const std::string  directoryName)

Renames a directory to <oldname>_<EPOCH>_DELETE_ME, where <EPOCH> is the current time in milliseconds since the system clock's epoch.

Parameters
directoryName  Name of the directory to rename.

Definition at line 210 of file File.cpp.

References logger::ERROR, report::error_code(), logger::FATAL, LOG, and to_string().

Referenced by File_Namespace::FileMgr::closeRemovePhysical(), Catalog_Namespace::Catalog::delDictionaryNontransactional(), Catalog_Namespace::Catalog::doTruncateTable(), and Catalog_Namespace::Catalog::removeTableFromMap().

210  {
211  boost::system::error_code ec;
212  boost::filesystem::path directoryPath(directoryName);
213  using namespace std::chrono;
214  milliseconds ms = duration_cast<milliseconds>(system_clock::now().time_since_epoch());
215 
216  if (boost::filesystem::exists(directoryPath) &&
217  boost::filesystem::is_directory(directoryPath)) {
218  boost::filesystem::path newDirectoryPath(directoryName + "_" +
219  std::to_string(ms.count()) + "_DELETE_ME");
220  boost::filesystem::rename(directoryPath, newDirectoryPath, ec);
221 
222 #ifdef _WIN32
223  // On Windows we sometimes fail to rename a directory with System: 5 error
224  // code (access denied). An attempt to stop in debugger and look for opened
225  // handles for some of directory content shows no opened handles and actually
226  // allows renaming to execute successfully. It's not clear why, but a short
227  // pause allows to rename directory successfully. Until reasons are known,
228  // use this retry loop as a workaround.
229  int tries = 10;
230  while (ec.value() != boost::system::errc::success && tries) {
231  LOG(ERROR) << "Failed to rename directory " << directoryPath << " error was " << ec
232  << " (" << tries << " attempts left)";
233  std::this_thread::sleep_for(std::chrono::milliseconds(100 / tries));
234  tries--;
235  boost::filesystem::rename(directoryPath, newDirectoryPath, ec);
236  }
237 #endif
238 
239  if (ec.value() == boost::system::errc::success) {
240  std::thread th([newDirectoryPath]() {
241  boost::system::error_code ec;
242  boost::filesystem::remove_all(newDirectoryPath, ec);
243  // We don't check the error on remove here, as we can't log the
244  // issue from a detached thread; it's not safe to LOG from here.
245  // This is under investigation, as clang detects a TSAN data race.
246  // The main system-wide file_delete_thread will clean up any missed files.
247  });
248  // let it run free so we can return
249  // if it fails the file_delete_thread in DBHandler will clean up
250  th.detach();
251 
252  return;
253  }
254 
255  LOG(FATAL) << "Failed to rename file " << directoryName << " to "
256  << directoryName + "_" + std::to_string(ms.count()) + "_DELETE_ME Error: "
257  << ec;
258  }
259 }
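
The new directory name in the listing above is built by appending a millisecond timestamp and a DELETE_ME suffix to the old name. A minimal sketch of that naming step alone follows; delete_me_name is a hypothetical name, and the timestamp is passed in as a parameter so the result is deterministic.

```cpp
#include <chrono>
#include <string>

// Hypothetical helper: build the rename target used before deletion.
std::string delete_me_name(const std::string& directory_name,
                           std::chrono::milliseconds since_epoch) {
  return directory_name + "_" + std::to_string(since_epoch.count()) +
         "_DELETE_ME";
}
```

In the real function the timestamp comes from system_clock::now(), so repeated renames of a directory with the same name are unlikely to collide.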

size_t File_Namespace::write ( FILE *  f,
const size_t  offset,
const size_t  size,
const int8_t *  buf 
)

Writes the specified number of bytes to the offset position in file f from buf.

Parameters
f  Pointer to the FILE.
offset  The location within the file where data is being written.
size  The number of bytes to write to the file.
buf  The source buffer containing the data to be written.
Returns
size_t The number of bytes written.

Definition at line 143 of file File.cpp.

References logger::FATAL, and LOG.

Referenced by StringDictionary::addStorageCapacity(), append(), appendPage(), lockmgr::TableLockMgrImpl< T >::getClusterTableMutex(), import_export::DataStreamSink::import_compressed(), heavyai::safe_write(), File_Namespace::TableFileMgr::writeAndSyncEpochToDisk(), File_Namespace::FileMgr::writeAndSyncVersionToDisk(), File_Namespace::FileMgr::writeFile(), writePage(), and writePartialPage().

143  {
144  // write size bytes from the buffer to the offset location in the file
145  if (fseek(f, static_cast<long>(offset), SEEK_SET) != 0) {
146  LOG(FATAL)
147  << "Error trying to write to file (during positioning seek) the error was: "
148  << std::strerror(errno);
149  }
150  size_t bytesWritten = fwrite(buf, sizeof(int8_t), size, f);
151  if (bytesWritten != sizeof(int8_t) * size) {
152  LOG(FATAL) << "Error trying to write to file (during fwrite) the error was: "
153  << std::strerror(errno);
154  }
155  return bytesWritten;
156 }
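
The listing above is a positional write: seek to the offset, then fwrite the buffer. A minimal sketch of the same pattern follows, returning false on failure rather than calling LOG(FATAL). write_at is a hypothetical name introduced for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Hypothetical non-fatal variant of write(): true only if the seek succeeded
// and every byte was written.
bool write_at(std::FILE* f,
              std::size_t offset,
              const std::int8_t* buf,
              std::size_t size) {
  if (std::fseek(f, static_cast<long>(offset), SEEK_SET) != 0) {
    return false;
  }
  return std::fwrite(buf, sizeof(std::int8_t), size, f) == size;
}
```

append() and appendPage() follow from this pattern by passing the current file size as the offset.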

size_t File_Namespace::writePage ( FILE *  f,
const size_t  pageSize,
const size_t  pageNum,
int8_t *  buf 
)

Writes a page from buf to the file.

Parameters
f  Pointer to the FILE.
pageSize  The logical page size of the file.
pageNum  The page number to where data is being written.
buf  The source buffer from where data is being read.
Returns
size_t The number of bytes written (should be equal to pageSize).

Definition at line 180 of file File.cpp.

References write().

180  {
181  return write(f, pageNum * pageSize, pageSize, buf);
182 }

size_t File_Namespace::writePartialPage ( FILE *  f,
const size_t  pageSize,
const size_t  offset,
const size_t  writeSize,
const size_t  pageNum,
int8_t *  buf 
)

Definition at line 184 of file File.cpp.

References write().

189  {
190  return write(f, pageNum * pageSize + offset, writeSize, buf);
191 }

Variable Documentation

constexpr int32_t File_Namespace::DELETE_CONTINGENT = -1

A FileInfo type has a file pointer and metadata about a file.

FileInfo A file info structure wraps around a file pointer in order to contain additional information/metadata about the file that is pertinent to the file manager.

The free pages (freePages) within a file must be tracked, and this is implemented using a basic STL set. The set ensures that no duplicate pages are included and that the pages are sorted, facilitating the retrieval of consecutive free pages by a constant-time pop operation, which may reduce the cost of DBMS disk accesses.

Helper functions are provided: size(), available(), and used().
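
The free-page bookkeeping described above can be sketched with std::set directly. The helper below is hypothetical and is not FileInfo's actual interface; it only illustrates why a sorted set hands out the lowest-numbered free page first.

```cpp
#include <cstddef>
#include <set>

// Hypothetical helper: remove and return the lowest-numbered free page.
// Returns false when no free page is available.
bool take_free_page(std::set<std::size_t>& free_pages, std::size_t& page_num) {
  if (free_pages.empty()) {
    return false;
  }
  const auto it = free_pages.begin();  // smallest element of a sorted set
  page_num = *it;
  free_pages.erase(it);  // erasing by iterator is amortized constant time
  return true;
}
```

Because the set is ordered, successive calls return ascending page numbers, which tends to keep allocations adjacent on disk.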

Definition at line 51 of file FileInfo.h.

Referenced by File_Namespace::FileInfo::freePage(), is_page_deleted_with_checkpoint(), is_page_deleted_without_checkpoint(), and File_Namespace::CachingFileMgr::updatePageIfDeleted().

constexpr int32_t File_Namespace::kDbVersion {2}

DB version for DataMgr DS and corresponding file buffer read/write code.

Definition at line 57 of file FileMgr.h.

constexpr auto File_Namespace::kLegacyDataFileExtension {".mapd"}
constexpr int32_t File_Namespace::ROLLOFF_CONTINGENT = -2