Binary and text files

Computer files can be divided into two broad categories: binary and text. Text files contain ordinary textual characters with essentially no formatting; binary files are all other files. Strictly speaking, text files are a special case of binary files, since any file is fundamentally a sequence of bits, and many computer components (for example, hard disk circuitry and most system software) make no distinction between file types. However, many application programs can understand and use text files in some way, whereas typically only a few programs can interpret the contents of a particular binary file. Hence the distinction is useful to computer users.


Text files

Text files (or plain text files) are files in which most bytes (or short sequences of bytes) represent ordinary readable characters such as letters, digits, and punctuation (including spaces), together with a few control characters such as tabs, line feeds, and carriage returns. This simplicity allows a wide variety of programs to display their contents.
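As a rough illustration of this definition, the sketch below guesses whether a sequence of bytes is plain text. It is only a heuristic under stated assumptions: it assumes UTF-8 as the encoding, and the function name looks_like_text is invented for this example rather than taken from any standard tool.

```python
import unicodedata

# Control characters that the definition above allows in a text file.
ALLOWED_CONTROLS = {"\t", "\n", "\r"}

def looks_like_text(data: bytes) -> bool:
    """Heuristic: valid UTF-8 with no control characters except tab, LF and CR."""
    try:
        decoded = data.decode("utf-8")  # assumption: the text would be UTF-8
    except UnicodeDecodeError:
        return False
    # Unicode category "Cc" covers the control characters.
    return not any(
        unicodedata.category(ch) == "Cc" and ch not in ALLOWED_CONTROLS
        for ch in decoded
    )

print(looks_like_text(b"name\tvalue\r\n"))  # ordinary text with tab, CR and LF
print(looks_like_text(b"abc\x00\x01"))      # contains NUL and another control byte
```

A file in a different encoding (for example UTF-16) would be misjudged by this check, which is one reason real tools use more elaborate tests.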


The similar term plaintext is most commonly used in a cryptographic context, where it refers to data that has not been encrypted. The similarity sometimes causes confusion, especially among those new to computers, cryptography, or data communications.


Generally, a text file contains characters in an ASCII-based encoding, or much less commonly an EBCDIC-based encoding, without any embedded information such as font information, hyperlinks, or inline images. Text files are often encoded in an extension of ASCII; these include ISO 8859, EUC, the Windows code pages, the Mac OS character sets, and the Unicode encoding scheme UTF-8. UTF-16, another Unicode encoding scheme common on many platforms, is not a byte-for-byte extension of ASCII, because it stores each character as one or two 16-bit units.
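To make the difference between encodings concrete, the following sketch encodes the same short string in three encodings and compares the resulting bytes. The sample string and the helper name encoded_sizes are chosen purely for illustration.

```python
def encoded_sizes(text: str) -> dict[str, int]:
    """Return how many bytes the text occupies in a few common encodings."""
    return {name: len(text.encode(name)) for name in ("latin-1", "utf-8", "utf-16-le")}

sample = "caf\u00e9"  # "café": three ASCII letters plus one accented letter
for name, size in encoded_sizes(sample).items():
    print(f"{name:10} {size} bytes  {sample.encode(name).hex(' ')}")

# Pure ASCII text has identical bytes in ASCII, Latin-1 (ISO 8859-1) and UTF-8,
plain = "cafe"
assert plain.encode("ascii") == plain.encode("latin-1") == plain.encode("utf-8")
# but not in UTF-16, which spends two bytes on every one of these characters.
assert plain.encode("utf-16-le") != plain.encode("ascii")
```

The accented letter takes one byte in Latin-1 and two in UTF-8, so a program that assumes the wrong encoding will display that character incorrectly even though the ASCII letters survive.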


Although most text files are meant for humans to read, some are also used for data storage by computer programs. Text files can be advantageous even for data storage because they avoid certain problems with binary files, such as endianness, padding bytes, or differences in the number of bytes in a machine word.
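The endianness problem can be sketched with Python's struct module. The value 1025 is an arbitrary example; the point is that the same 32-bit integer has two different binary layouts, while its decimal text form does not depend on byte order.

```python
import struct

value = 1025  # arbitrary example value, 0x00000401 as a 32-bit integer

big = struct.pack(">I", value)        # big-endian: most significant byte first
little = struct.pack("<I", value)     # little-endian: least significant byte first
as_text = str(value).encode("ascii")  # decimal digits, independent of byte order

print(big.hex(" "))     # 00 00 04 01
print(little.hex(" "))  # 01 04 00 00
print(as_text)          # b'1025'

# Reading big-endian data as if it were little-endian silently gives a wrong number.
misread = struct.unpack("<I", big)[0]
print(misread)          # 17039360
```

The text form costs parsing time and, for large values, extra space, which is the trade-off a program accepts when it stores numbers as text.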


Plain text is often used as a readable representation of other data that is not itself purely textual: for example, a formatted webpage is not plain text, but its HTML source is. Similarly, source code for computer programs is usually stored in text files, but is compiled into a binary form for execution.


Text files usually have the MIME type "text/plain", often with an additional parameter indicating the character encoding. Prior to the advent of Mac OS X, Mac OS regarded the content of a file (the data fork) as text when the file's type code was "TEXT". Windows treats a file as a text file if the suffix of its name is "txt". Source code for computer programs is also text, but such files usually have name suffixes indicating the programming language in which the source is written.


Unix, Macintosh, Microsoft Windows, and DOS differ not only in which character encodings are common on the platform, but also in which line-ending convention is most common: Unix uses a line feed, Mac OS before Mac OS X used a carriage return, and DOS and Windows use a carriage return followed by a line feed. See newline for a discussion of this.
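The three usual conventions are a line feed (Unix), a carriage return (Mac OS before Mac OS X), and a carriage return followed by a line feed (DOS and Windows). A minimal sketch of converting all of them to the Unix form follows; the function name normalize_newlines and the sample data are made up for this example.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CR LF and bare CR line endings to a single LF."""
    # Replace the two-byte sequence first so that it is not turned into two LFs.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")

samples = {
    "Unix (LF)": b"one\ntwo\n",
    "classic Mac OS (CR)": b"one\rtwo\r",
    "DOS/Windows (CR LF)": b"one\r\ntwo\r\n",
}

for platform, raw in samples.items():
    print(f"{platform:22} {raw!r} -> {normalize_newlines(raw)!r}")
```

The order of the two replacements matters: handling the bare carriage return first would turn every DOS line ending into two line feeds.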


Binary files

Binary files, in contrast, may contain any data whatsoever (including plain text, since binary file is the more general concept), and usually consist mostly of bytes that should not be directly interpreted as characters. Compiled computer programs are typical examples, as the data and CPU instructions they contain can, in principle, be any binary value. As a result, compiled programs (executables and object files) are sometimes referred to as binaries. But binary files can also be image files, sound files, compressed versions of other files (of either type), and so on: in short, any file content whatsoever. Many binary file formats contain parts that are plain text.
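Because many binary formats embed stretches of plain text, it is possible to pull those stretches out without understanding the rest of the format. The sketch below does this for printable ASCII only; the function name printable_runs, the minimum length of 4, and the sample bytes are all assumptions made for illustration and do not correspond to any real file format.

```python
import re

def printable_runs(data: bytes, min_length: int = 4) -> list[bytes]:
    """Return runs of at least min_length printable ASCII bytes (codes 32 to 126)."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return re.findall(pattern, data)

# Made-up binary blob with two readable fragments among non-text bytes.
blob = b"\x00\x01\x02Hello, world\x00\xff\xfeab\x00version 1.0\x7f"

for run in printable_runs(blob):
    print(run.decode("ascii"))
```

The short fragment "ab" is skipped because it falls below the minimum length, which is how such a scan avoids reporting accidental matches in arbitrary data.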


To send binary files through certain systems (such as e-mail) that do not allow all data values, they are often translated into a plain text representation (using, for example, Base64). This encoding has the disadvantage of increasing the file's size by roughly one third during the transfer, since Base64 represents every 3 bytes as 4 characters, and of requiring translation back into binary after receipt. See ASCII armor for more on this subject.
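The size cost of Base64 can be measured directly with Python's base64 module. The input below is an arbitrary block containing every possible byte value; Base64 maps each group of 3 bytes to 4 text characters, so the overhead comes to one third before any line breaks that a mail system may add.

```python
import base64

raw = bytes(range(256)) * 12        # 3072 bytes covering every byte value
encoded = base64.b64encode(raw)     # ASCII letters, digits, "+", "/" and "=" only

overhead = len(encoded) / len(raw) - 1
print(len(raw), len(encoded), f"{overhead:.1%}")

# Decoding after receipt restores the original bytes exactly.
decoded = base64.b64decode(encoded)
assert decoded == raw
```

The encoded form contains only characters that text-only channels accept, which is the reason for accepting the extra size.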


Binary is nothing more than a numeral system. Binary files are usually thought of as a sequence of bytes, which means the binary digits (bits) are grouped in eights. If you open a binary file in a text editor that assumes a single-byte encoding, each group of eight bits is translated into a single character, and you see a (probably unintelligible) display of textual characters. If you open it in some other application, that application has its own use for each byte: it may treat each byte as a number and output a stream of numbers between 0 and 255, or it may interpret the numbers in the bytes as colors and display the corresponding picture. If the file is treated as an executable and run, the computer attempts to interpret the file as a series of instructions in its machine language.
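The following sketch shows the same six bytes read in three of the ways described above. The byte values are arbitrary, Latin-1 is assumed as the text editor's single-byte encoding, and the grouping into three-byte red, green, blue triples is just one possible color layout.

```python
data = bytes([72, 105, 33, 200, 16, 255])  # arbitrary example bytes

# 1. As a text editor might: one character per byte (assuming Latin-1).
as_text = data.decode("latin-1")

# 2. As a stream of numbers between 0 and 255.
as_numbers = list(data)

# 3. As colors: every three bytes taken as one (red, green, blue) pixel.
as_pixels = [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]

print(repr(as_text))
print(as_numbers)   # [72, 105, 33, 200, 16, 255]
print(as_pixels)    # [(72, 105, 33), (200, 16, 255)]
```

Nothing in the bytes themselves says which reading is the right one; that knowledge lives in the program that opens the file.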


See binary numeral system to understand how eight bits can be converted into a "normal" decimal number.
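As a brief worked example of that conversion, the sketch below turns a string of bits into a number by doubling the running total and adding each new bit. The function name bits_to_int is hypothetical; Python's built-in int(bits, 2) performs the same conversion.

```python
def bits_to_int(bits: str) -> int:
    """Convert a string of '0' and '1' characters to the number it represents."""
    value = 0
    for bit in bits:
        if bit not in "01":
            raise ValueError(f"not a binary digit: {bit!r}")
        # Shift the digits read so far one place left, then add the new bit.
        value = value * 2 + int(bit)
    return value

print(bits_to_int("01000001"))       # 65
print(chr(bits_to_int("01000001")))  # A
print(bits_to_int("11111111"))       # 255
```

Eight bits therefore cover the values 0 to 255, which is why a byte read as a number always falls in that range.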


The Wikipedia article included on this page is licensed under the GFDL.