OmniSciDB  a5dc49c757
File_Namespace::FileInfo Struct Reference

#include <FileInfo.h>

Public Member Functions

 FileInfo (FileMgr *fileMgr, const int32_t fileId, FILE *f, const size_t pageSize, const size_t numPages, const std::string &file_path, const bool init=false)
 Constructor. More...
 
 ~FileInfo ()
 Destructor. More...
 
void initNewFile ()
 Adds all pages to freePages and zeroes first four bytes of header. More...
 
void freePageDeferred (int32_t pageId)
 
void freePage (int32_t pageId, const bool isRolloff, int32_t epoch)
 
int32_t getFreePage ()
 
size_t write (const size_t offset, const size_t size, const int8_t *buf)
 
size_t read (const size_t offset, const size_t size, int8_t *buf)
 
void openExistingFile (std::vector< HeaderInfo > &headerVec)
 
std::string print () const
 Returns a summary of the file as a string. More...
 
size_t size () const
 Returns the total size of the file in bytes (pageSize * numPages). More...
 
int32_t syncToDisk ()
 
size_t available () const
 Returns the number of free bytes available. More...
 
size_t numFreePages () const
 Returns the number of free pages available. More...
 
std::set< size_t > getFreePages () const
 
size_t used () const
 Returns the amount of used bytes; size() - available() More...
 
void freePageImmediate (int32_t page_num)
 
void recoverPage (const ChunkKey &chunk_key, int32_t page_num)
 

Public Attributes

FileMgr * fileMgr
 
int32_t fileId
 unique file identifier (i.e., used for a file name) More...
 
FILE * f
 file stream object for the represented file More...
 
size_t pageSize
 the fixed size of each page in the file More...
 
size_t numPages
 the number of pages in the file More...
 
bool isDirty {false}
 
std::set< size_t > freePages
 set of page numbers of free pages More...
 
std::string file_path
 
std::mutex freePagesMutex_
 
std::mutex readWriteMutex_
 

Detailed Description

A FileInfo type has a file pointer and metadata about a file.

Definition at line 55 of file FileInfo.h.

Constructor & Destructor Documentation

File_Namespace::FileInfo::FileInfo ( FileMgr *  fileMgr,
const int32_t  fileId,
FILE *  f,
const size_t  pageSize,
const size_t  numPages,
const std::string &  file_path,
const bool  init = false 
)

Constructor.

Definition at line 31 of file FileInfo.cpp.

References initNewFile().

38  : fileMgr(fileMgr)
39  , fileId(fileId)
40  , f(f)
41  , pageSize(pageSize)
42  , numPages(numPages)
43  , file_path(file_path) {
44  if (init) {
45  initNewFile();
46  }
47 }

File_Namespace::FileInfo::~FileInfo ( )

Destructor.

Definition at line 49 of file FileInfo.cpp.

References File_Namespace::close(), and f.

49  {
50  // close file, if applicable
51  if (f) {
52  close(f);
53  }
54 }

Member Function Documentation

size_t File_Namespace::FileInfo::available ( ) const
inline

Returns the number of free bytes available.

Definition at line 102 of file FileInfo.h.

References numFreePages(), and pageSize.

Referenced by print(), and used().

102 { return numFreePages() * pageSize; }

void File_Namespace::FileInfo::freePage ( int32_t  pageId,
const bool  isRolloff,
int32_t  epoch 
)

Definition at line 187 of file FileInfo.cpp.

References CHECK, File_Namespace::DELETE_CONTINGENT, fileMgr, File_Namespace::FileMgr::free_page(), pageSize, File_Namespace::ROLLOFF_CONTINGENT, and write().

Referenced by File_Namespace::FileBuffer::freePage(), and File_Namespace::CachingFileMgr::updatePageIfDeleted().

187  {
188  int32_t epoch_freed_page[2] = {DELETE_CONTINGENT, epoch};
189  if (isRolloff) {
190  epoch_freed_page[0] = ROLLOFF_CONTINGENT;
191  }
192  write(pageId * pageSize + sizeof(int32_t),
193  sizeof(epoch_freed_page),
194  reinterpret_cast<const int8_t*>(epoch_freed_page));
195  fileMgr->free_page(std::make_pair(this, pageId));
196 
197 #ifdef ENABLE_CRASH_CORRUPTION_TEST
198  signal(SIGUSR2, sighandler);
199  if (goto_crash)
200  CHECK(pageId % 8 != 4);
201 #endif
202 }

void File_Namespace::FileInfo::freePageDeferred ( int32_t  pageId)

Definition at line 172 of file FileInfo.cpp.

References freePages, and freePagesMutex_.

Referenced by freePageImmediate().

172  {
173  std::lock_guard<std::mutex> lock(freePagesMutex_);
174  freePages.insert(pageId);
175 }

void File_Namespace::FileInfo::freePageImmediate ( int32_t  page_num)

Definition at line 245 of file FileInfo.cpp.

References freePageDeferred(), pageSize, and write().

Referenced by initNewFile(), openExistingFile(), and File_Namespace::FileMgr::updatePageIfDeleted().

245  {
246  int32_t zero{0};
247  write(page_num * pageSize, sizeof(int32_t), reinterpret_cast<const int8_t*>(&zero));
248  freePageDeferred(page_num);
249 }

int32_t File_Namespace::FileInfo::getFreePage ( )

Definition at line 204 of file FileInfo.cpp.

References freePages, and freePagesMutex_.

Referenced by File_Namespace::FileMgr::copySourcePageForCompaction(), File_Namespace::FileMgr::requestFreePage(), File_Namespace::CachingFileMgr::requestFreePage(), and File_Namespace::FileMgr::requestFreePages().

204  {
205  // returns -1 if there is no free page
206  std::lock_guard<std::mutex> lock(freePagesMutex_);
207  if (freePages.size() == 0) {
208  return -1;
209  }
210  auto pageIt = freePages.begin();
211  int32_t pageNum = *pageIt;
212  freePages.erase(pageIt);
213  return pageNum;
214 }
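
The free-page bookkeeping in freePageDeferred() and getFreePage() can be sketched in isolation. The `FreePageList` class below is a hypothetical stand-in, not part of OmniSciDB; it only mirrors the pattern of a mutex-guarded `std::set` that hands out the lowest-numbered free page and returns -1 when none is left.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <set>

// Hypothetical stand-in for the freePages / freePagesMutex_ pair in FileInfo.
class FreePageList {
 public:
  // Mirrors freePageDeferred(): record a page as free.
  void release(int32_t page_id) {
    std::lock_guard<std::mutex> lock(mutex_);
    free_pages_.insert(static_cast<size_t>(page_id));
  }

  // Mirrors getFreePage(): remove and return the smallest free page,
  // or -1 if there is no free page.
  int32_t acquire() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (free_pages_.empty()) {
      return -1;
    }
    auto it = free_pages_.begin();
    const int32_t page_num = static_cast<int32_t>(*it);
    free_pages_.erase(it);
    return page_num;
  }

  // Mirrors numFreePages().
  size_t count() const {
    std::lock_guard<std::mutex> lock(mutex_);
    return free_pages_.size();
  }

 private:
  mutable std::mutex mutex_;  // mutable so that const accessors can lock
  std::set<size_t> free_pages_;
};
```

Because `std::set` is ordered, `begin()` always yields the lowest page number, so pages are reused from the front of the file first.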

std::set<size_t> File_Namespace::FileInfo::getFreePages ( ) const
inline

Definition at line 110 of file FileInfo.h.

References freePages, and freePagesMutex_.

110  {
111  std::lock_guard<std::mutex> lock(freePagesMutex_);
112  return freePages;
113  }

void File_Namespace::FileInfo::initNewFile ( )

Adds all pages to freePages and zeroes first four bytes of header.

Definition at line 56 of file FileInfo.cpp.

References freePageImmediate(), and numPages.

Referenced by FileInfo().

56  {
57  // initialize pages and free page list
58  // Also zeroes out first four bytes of every header
59  for (size_t pageId = 0; pageId < numPages; ++pageId) {
60  freePageImmediate(pageId);
61  }
62 }

size_t File_Namespace::FileInfo::numFreePages ( ) const
inline

Returns the number of free pages available.

Definition at line 105 of file FileInfo.h.

References freePages, and freePagesMutex_.

Referenced by available().

105  {
106  std::lock_guard<std::mutex> lock(freePagesMutex_);
107  return freePages.size();
108  }

void File_Namespace::FileInfo::openExistingFile ( std::vector< HeaderInfo > &  headerVec)

Definition at line 75 of file FileInfo.cpp.

References CHECK_EQ, CHECK_GE, CHUNK_KEY_DB_IDX, CHUNK_KEY_TABLE_IDX, File_Namespace::FileMgr::epoch(), f, fileId, fileMgr, freePageImmediate(), freePages, g_multi_instance, g_read_only, LOG, numPages, pageSize, show_chunk(), File_Namespace::FileMgr::updatePageIfDeleted(), VLOG, and logger::WARNING.

75  {
76  // HeaderInfo is defined in Page.h
77 
78  // Oct 2020: Changing semantics such that fileMgrEpoch should be last checkpointed
79  // epoch, not incremented epoch. This changes some of the gt/gte/lt/lte comparison below
80  ChunkKey oldChunkKey(4);
81  int32_t oldPageId = -99;
82  int32_t oldVersionEpoch = -99;
83  int32_t skipped = 0;
84  for (size_t pageNum = 0; pageNum < numPages; ++pageNum) {
85  // TODO(Misiu): It would be nice to replace this array with a struct that would
86  // clarify what is being read and have a single definition (currently this code is
87  // replicated in TableArchiver and possibly elsewhere).
88  constexpr size_t MAX_INTS_TO_READ{10}; // currently use 1+6 ints
89  int32_t ints[MAX_INTS_TO_READ];
90  CHECK_EQ(fseek(f, pageNum * pageSize, SEEK_SET), 0);
91  CHECK_EQ(fread(ints, sizeof(int32_t), MAX_INTS_TO_READ, f), MAX_INTS_TO_READ);
92 
93  auto headerSize = ints[0];
94  if (headerSize == 0) {
95  // no header for this page - insert into free list
96  freePages.insert(pageNum);
97  continue;
98  }
99 
100  // headerSize doesn't include headerSize itself
101  // We're tying ourself to headers of ints here
102  size_t numHeaderElems = headerSize / sizeof(int32_t);
103  CHECK_GE(numHeaderElems, size_t(2));
104  // We don't want to read headerSize in our header - so start
105  // reading 4 bytes past it
106  ChunkKey chunkKey(&ints[1], &ints[1 + numHeaderElems - 2]);
107  if (fileMgr->updatePageIfDeleted(this, chunkKey, ints[1], ints[2], pageNum)) {
108  continue;
109  }
110  // Last two elements of header are always PageId and Version
111  // epoch - these are not in the chunk key so separate them
112  int32_t pageId = ints[1 + numHeaderElems - 2];
113  int32_t versionEpoch = ints[1 + numHeaderElems - 1];
114  if (chunkKey != oldChunkKey || oldPageId != pageId - (1 + skipped)) {
115  if (skipped > 0) {
116  VLOG(4) << "FId.PSz: " << fileId << "." << pageSize
117  << " Chunk key: " << show_chunk(oldChunkKey)
118  << " Page id from : " << oldPageId << " to : " << oldPageId + skipped
119  << " Epoch: " << oldVersionEpoch;
120  } else if (oldPageId != -99) {
121  VLOG(4) << "FId.PSz: " << fileId << "." << pageSize
122  << " Chunk key: " << show_chunk(oldChunkKey) << " Page id: " << oldPageId
123  << " Epoch: " << oldVersionEpoch;
124  }
125  oldPageId = pageId;
126  oldVersionEpoch = versionEpoch;
127  oldChunkKey = chunkKey;
128  skipped = 0;
129  } else {
130  skipped++;
131  }
132 
133  /* Check if version epoch is greater than
134  * FileMgr epoch_ (the last checkpointed
135  * epoch) - this means that this
136  * page wasn't checkpointed and thus we should
137  * not use it
138  */
139  int32_t fileMgrEpoch =
140  fileMgr->epoch(chunkKey[CHUNK_KEY_DB_IDX], chunkKey[CHUNK_KEY_TABLE_IDX]);
141  if (versionEpoch > fileMgrEpoch) {
142  // First write 0 to first four bytes of
143  // header to mark as free
144  if (!g_read_only && !g_multi_instance) {
145  // Read-only mode can find pages like this if the server was previously run in
146  // write-mode but is not allowed to free them.
147  freePageImmediate(pageNum);
148  LOG(WARNING) << "Was not checkpointed: Chunk key: " << show_chunk(chunkKey)
149  << " Page id: " << pageId << " Epoch: " << versionEpoch
150  << " FileMgrEpoch " << fileMgrEpoch << endl;
151  }
152  } else { // page was checkpointed properly
153  Page page(fileId, pageNum);
154  headerVec.emplace_back(chunkKey, pageId, versionEpoch, page);
155  }
156  }
157  // printlast
158  if (oldPageId != -99) {
159  if (skipped > 0) {
160  VLOG(4) << "FId.PSz: " << fileId << "." << pageSize
161  << " Chunk key: " << show_chunk(oldChunkKey)
162  << " Page id from : " << oldPageId << " to : " << oldPageId + skipped
163  << " Epoch: " << oldVersionEpoch;
164  } else {
165  VLOG(4) << "FId.PSz: " << fileId << "." << pageSize
166  << " Chunk key: " << show_chunk(oldChunkKey) << " Page id: " << oldPageId
167  << " Epoch: " << oldVersionEpoch;
168  }
169  }
170 }
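
The header decoding inside the loop above can be illustrated on a plain array of `int32_t` words. This is a minimal sketch under stated assumptions: `ParsedHeader` and `parse_page_header` are hypothetical names (the real code fills a `HeaderInfo` from Page.h), and a malformed header throws here, whereas the original aborts through `CHECK_GE`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical result type for this sketch.
struct ParsedHeader {
  std::vector<int32_t> chunk_key;
  int32_t page_id;
  int32_t version_epoch;
};

// Decodes the leading words of a page:
//   ints[0]              header size in bytes, not counting itself
//   ints[1 .. n-2]       chunk key
//   ints[n-1], ints[n]   page id and version epoch (n = header size / 4)
// Returns std::nullopt for a free page (header size of 0).
std::optional<ParsedHeader> parse_page_header(const int32_t* ints, size_t count) {
  if (count == 0 || ints[0] < 0) {
    throw std::invalid_argument("missing or negative header size");
  }
  if (ints[0] == 0) {
    return std::nullopt;  // free page
  }
  const size_t num_elems = static_cast<size_t>(ints[0]) / sizeof(int32_t);
  if (num_elems < 2 || count < 1 + num_elems) {
    throw std::invalid_argument("header too short for page id and epoch");
  }
  ParsedHeader header;
  // Skip the header-size word, stop before the last two elements.
  header.chunk_key.assign(ints + 1, ints + 1 + num_elems - 2);
  header.page_id = ints[1 + num_elems - 2];
  header.version_epoch = ints[1 + num_elems - 1];
  return header;
}
```

With a header size of 24 bytes, the six elements split into a four-element chunk key followed by the page id and the version epoch, which matches the "1+6 ints" noted in the listing.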

std::string File_Namespace::FileInfo::print ( ) const

Returns a summary of the file as a string.

Definition at line 216 of file FileInfo.cpp.

References available(), fileId, size(), and used().

216  {
217  std::stringstream ss;
218  ss << "File: " << fileId << std::endl;
219  ss << "Size: " << size() << std::endl;
220  ss << "Used: " << used() << std::endl;
221  ss << "Free: " << available() << std::endl;
222  return ss.str();
223 }

size_t File_Namespace::FileInfo::read ( const size_t  offset,
const size_t  size,
int8_t *  buf 
)

Definition at line 70 of file FileInfo.cpp.

References f, file_path, File_Namespace::read(), and readWriteMutex_.

Referenced by File_Namespace::FileBuffer::copyPage(), File_Namespace::FileMgr::copyPage(), File_Namespace::FileMgr::copyPageWithoutHeaderSize(), and File_Namespace::readForThread().

70  {
71  std::lock_guard<std::mutex> lock(readWriteMutex_);
72  return File_Namespace::read(f, offset, size, buf, file_path);
73 }

void File_Namespace::FileInfo::recoverPage ( const ChunkKey &  chunk_key,
int32_t  page_num 
)

Definition at line 252 of file FileInfo.cpp.

References CHECK, g_multi_instance, pageSize, and write().

Referenced by File_Namespace::FileMgr::updatePageIfDeleted().

252  {
253  CHECK(!g_multi_instance) << "Attempted unsynchronized write in multi-instance mode";
254  write(page_num * pageSize + sizeof(int32_t),
255  2 * sizeof(int32_t),
256  reinterpret_cast<const int8_t*>(chunk_key.data()));
257 }
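
freePage() and recoverPage() both write to the two words that follow the header-size word of a page: the first overwrites them with a contingent marker and an epoch, the second restores them from the chunk key. The sketch below replays that round trip on an in-memory buffer. `PagedImage` is a hypothetical stand-in for the file, and the two constant values are placeholders chosen for illustration; the real constants are defined in FileInfo.h.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Placeholder values for this sketch only.
constexpr int32_t kDeleteContingent = -1;
constexpr int32_t kRolloffContingent = -2;

// In-memory stand-in for a paged file.
struct PagedImage {
  size_t page_size;
  std::vector<int8_t> bytes;

  PagedImage(size_t page_size_in, size_t num_pages)
      : page_size(page_size_in), bytes(page_size_in * num_pages, 0) {}

  void write(size_t offset, size_t size, const void* src) {
    assert(offset + size <= bytes.size());
    std::memcpy(bytes.data() + offset, src, size);
  }

  // Reads the index-th int32 word of a page.
  int32_t word(size_t page, size_t index) const {
    int32_t value = 0;
    std::memcpy(&value,
                bytes.data() + page * page_size + index * sizeof(int32_t),
                sizeof(value));
    return value;
  }

  // Like freePage(): overwrite words 1 and 2 with {contingent, epoch}.
  void mark_freed(size_t page, bool is_rolloff, int32_t epoch) {
    const int32_t marker[2] = {
        is_rolloff ? kRolloffContingent : kDeleteContingent, epoch};
    write(page * page_size + sizeof(int32_t), sizeof(marker), marker);
  }

  // Like recoverPage(): restore words 1 and 2 from the chunk key.
  void recover(size_t page, const std::vector<int32_t>& chunk_key) {
    assert(chunk_key.size() >= 2);
    write(page * page_size + sizeof(int32_t), 2 * sizeof(int32_t), chunk_key.data());
  }
};
```

The header-size word and the rest of the chunk key are never touched, which is why restoring the first two chunk key elements is enough to bring the page back.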

size_t File_Namespace::FileInfo::size ( ) const
inline

Returns the total size of the file in bytes (pageSize * numPages).

Definition at line 95 of file FileInfo.h.

References numPages, and pageSize.

Referenced by print(), and used().

95 { return pageSize * numPages; }

int32_t File_Namespace::FileInfo::syncToDisk ( )

Syncs the file to disk via a buffer flush followed by a sync (fflush and fsync on POSIX systems).

Definition at line 225 of file FileInfo.cpp.

References f, logger::FATAL, heavyai::fsync(), isDirty, LOG, and readWriteMutex_.

225  {
226  std::lock_guard<std::mutex> lock(readWriteMutex_);
227  if (isDirty) {
228  if (fflush(f) != 0) {
229  LOG(FATAL) << "Error trying to flush changes to disk, the error was: "
230  << std::strerror(errno);
231  }
232 #ifdef __APPLE__
233  const int32_t sync_result = fcntl(fileno(f), 51);
234 #else
235  const int32_t sync_result = heavyai::fsync(fileno(f));
236 #endif
237  if (sync_result == 0) {
238  isDirty = false;
239  }
240  return sync_result;
241  }
242  return 0; // if file was not dirty and no syncing was needed
243 }
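
The flush-then-sync pattern used by syncToDisk() can be shown with plain stdio and POSIX calls. This is a simplified sketch, assuming a POSIX system: `sync_if_dirty` is a hypothetical helper, it omits the mutex, and it returns -1 on a failed `fflush` where the original logs a fatal error.

```cpp
#include <cassert>
#include <cstdio>
#include <unistd.h>  // fsync (POSIX)

// Flushes the stdio buffer, then asks the OS to persist the file.
// Returns 0 on success or when the file was not dirty.
int sync_if_dirty(std::FILE* f, bool& is_dirty) {
  if (!is_dirty) {
    return 0;  // nothing was written since the last sync
  }
  if (std::fflush(f) != 0) {
    return -1;  // user-space buffer could not be handed to the OS
  }
  const int result = fsync(fileno(f));
  if (result == 0) {
    is_dirty = false;  // only clear the flag once the data is on disk
  }
  return result;
}
```

Both steps are needed: `fflush` only moves data from the stdio buffer to the kernel, while `fsync` requests that the kernel write it to the storage device.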

size_t File_Namespace::FileInfo::used ( ) const
inline

Returns the amount of used bytes; size() - available()

Definition at line 116 of file FileInfo.h.

References available(), and size().

Referenced by print().

116 { return size() - available(); }
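
The relationship between size(), available(), and used() is simple page arithmetic. The sketch below restates it with a hypothetical `PageAccounting` struct that takes the free-page count as a plain number instead of reading it from the free-page set.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical value type mirroring the three accessors of FileInfo.
struct PageAccounting {
  size_t page_size;
  size_t num_pages;
  size_t num_free_pages;

  size_t size() const { return page_size * num_pages; }            // total bytes
  size_t available() const { return num_free_pages * page_size; }  // free bytes
  size_t used() const { return size() - available(); }             // occupied bytes
};
```

For example, a file of 256 pages of 4096 bytes with 56 free pages has a size of 1048576 bytes, 229376 bytes available, and 819200 bytes used.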

size_t File_Namespace::FileInfo::write ( const size_t  offset,
const size_t  size,
const int8_t *  buf 
)

Definition at line 64 of file FileInfo.cpp.

References f, fileMgr, isDirty, readWriteMutex_, and File_Namespace::FileMgr::writeFile().

Referenced by File_Namespace::FileBuffer::append(), File_Namespace::FileBuffer::copyPage(), File_Namespace::FileMgr::copyPage(), File_Namespace::FileMgr::copyPageWithoutHeaderSize(), freePage(), freePageImmediate(), recoverPage(), File_Namespace::FileBuffer::write(), and File_Namespace::FileBuffer::writeHeader().

64  {
65  std::lock_guard<std::mutex> lock(readWriteMutex_);
66  isDirty = true;
67  return fileMgr->writeFile(f, offset, size, buf);
68 }

Member Data Documentation

FILE* File_Namespace::FileInfo::f

file stream object for the represented file

Definition at line 58 of file FileInfo.h.

Referenced by openExistingFile(), read(), syncToDisk(), write(), and ~FileInfo().

std::string File_Namespace::FileInfo::file_path

Definition at line 63 of file FileInfo.h.

Referenced by read().

FileMgr* File_Namespace::FileInfo::fileMgr

Definition at line 56 of file FileInfo.h.

Referenced by freePage(), openExistingFile(), and write().

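std::set<size_t> File_Namespace::FileInfo::freePages

set of page numbers of free pages

Definition at line 62 of file FileInfo.h.

Referenced by freePageDeferred(), getFreePage(), getFreePages(), numFreePages(), and openExistingFile().

std::mutex File_Namespace::FileInfo::freePagesMutex_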
mutable

Definition at line 64 of file FileInfo.h.

Referenced by freePageDeferred(), getFreePage(), getFreePages(), and numFreePages().

bool File_Namespace::FileInfo::isDirty {false}

true if the file has been written to since the last successful sync

Definition at line 61 of file FileInfo.h.

Referenced by syncToDisk(), and write().

size_t File_Namespace::FileInfo::numPages

the number of pages in the file

Definition at line 60 of file FileInfo.h.

Referenced by initNewFile(), openExistingFile(), and size().

size_t File_Namespace::FileInfo::pageSize

the fixed size of each page in the file

Definition at line 59 of file FileInfo.h.

Referenced by available(), File_Namespace::FileMgr::copyPageWithoutHeaderSize(), freePage(), freePageImmediate(), openExistingFile(), recoverPage(), and size().

std::mutex File_Namespace::FileInfo::readWriteMutex_
mutable

Definition at line 65 of file FileInfo.h.

Referenced by read(), syncToDisk(), and write().


The documentation for this struct was generated from the following files: