208 lines
7.0 KiB
Plaintext
208 lines
7.0 KiB
Plaintext
Metadata-Version: 2.1
|
||
Name: extract-msg
|
||
Version: 0.23.1
|
||
Summary: Extracts emails and attachments saved in Microsoft Outlook's .msg files
|
||
Home-page: https://github.com/mattgwwalker/msg-extractor
|
||
Author: Matthew Walker & The Elemental of Creation
|
||
Author-email: mattgwwalker@gmail.com, arceusthe@gmail.com
|
||
License: GPL
|
||
Download-URL: https://github.com/mattgwwalker/msg-extractor/archives/master
|
||
Platform: UNKNOWN
|
||
Requires-Dist: imapclient (==2.1.0)
|
||
Requires-Dist: olefile (==0.46)
|
||
Requires-Dist: tzlocal (==1.5.1)
|
||
|
||
|License: GPL v3| |PyPI3| |PyPI1| |PyPI2|
|
||
|
||
msg-extractor
|
||
=============
|
||
|
||
Extracts emails and attachments saved in Microsoft Outlook’s .msg files
|
||
|
||
The python package extract_msg automates the extraction of key email
|
||
data (from, to, cc, date, subject, body) and the email’s attachments.
|
||
|
||
- `Changelog <CHANGELOG.md>`__
|
||
|
||
Usage
|
||
-----
|
||
|
||
**To use it as a command-line script**:
|
||
|
||
::
|
||
|
||
python extract_msg example.msg
|
||
|
||
This will produce a new folder named according to the date, time and
|
||
subject of the message (for example “2013-07-24_0915 Example”). The
|
||
email itself can be found inside the new folder along with the
|
||
attachments.
|
||
|
||
The script uses Philippe Lagadec’s Python module that reads Microsoft
|
||
OLE2 files (also called Structured Storage, Compound File Binary Format
|
||
or Compound Document File Format). This is the underlying format of
|
||
Outlook’s .msg files. This library currently supports up to Python 2.7
|
||
and 3.4.
|
||
|
||
The script was built using Peter Fiskerstrand’s documentation of the
|
||
.msg format. Redemption’s discussion of the different property types
|
||
used within Extended MAPI was also useful. For future reference, I note
|
||
that Microsoft have opened up their documentation of the file format.
|
||
|
||
|
||
#########REWRITE COMMAND LINE USAGE#############
|
||
Currently, the README is in the process of being redone. For now, please
|
||
refer to the usage information provided from the program's help dialog:
|
||
::
|
||
usage: extract_msg [-h] [--use-content-id] [--dev] [--validate] [--json]
|
||
[--file-logging] [--verbose] [--log LOG]
|
||
[--config CONFIG_PATH] [--out OUT_PATH] [--use-filename]
|
||
msg [msg ...]
|
||
|
||
extract_msg: Extracts emails and attachments saved in Microsoft Outlook's .msg
|
||
files. https://github.com/mattgwwalker/msg-extractor
|
||
|
||
positional arguments:
|
||
msg An msg file to be parsed
|
||
|
||
optional arguments:
|
||
-h, --help show this help message and exit
|
||
--use-content-id, --cid
|
||
Save attachments by their Content ID, if they have
|
||
one. Useful when working with the HTML body.
|
||
--dev Changes to use developer mode. Automatically enables
|
||
the --verbose flag. Takes precedence over the
|
||
--validate flag.
|
||
--validate Turns on file validation mode. Turns off regular file
|
||
output.
|
||
--json Changes to write output files as json.
|
||
--file-logging Enables file logging. Implies --verbose
|
||
--verbose Turns on console logging.
|
||
--log LOG Set the path to write the file log to.
|
||
--config CONFIG_PATH Set the path to load the logging config from.
|
||
--out OUT_PATH Set the folder to use for the program output.
|
||
(Default: Current directory)
|
||
--use-filename Sets whether the name of each output is based on the
|
||
msg filename.
|
||
|
||
**To use this in your own script**, start by using:
|
||
|
||
::
|
||
|
||
import extract_msg
|
||
|
||
From there, initialize an instance of the Message class:
|
||
|
||
::
|
||
|
||
msg = extract_msg.Message("path/to/msg/file.msg")
|
||
|
||
Alternatively, if you wish to send a msg binary string instead of a file
|
||
to the ExtractMsg.Message Method:
|
||
|
||
::
|
||
|
||
msg_raw = b'\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1\x00 ... \x00x00x00'
|
||
msg = extract_msg.Message(msg_raw)
|
||
|
||
If you want to override the default attachment class and use one of your
|
||
own, simply change the code to:
|
||
|
||
::
|
||
|
||
msg = extract_msg.Message("path/to/msg/file.msg", attachmentClass = CustomAttachmentClass)
|
||
|
||
where ``CustomAttachmentClass`` is your custom class.
|
||
|
||
#TODO: Finish this section
|
||
|
||
If you have any questions feel free to contact me, Matthew Walker, at
|
||
mattgwwalker at gmail.com. NOTE: Due to time constraints, The Elemental
|
||
of Creation has been added as a contributor to help manage the project.
|
||
As such, it may be helpful to send emails to arceusthe@gmail.com as
|
||
well.
|
||
|
||
If you have issues, it would be best to get help for them by opening a
|
||
new github issue.
|
||
|
||
Error Reporting
|
||
---------------
|
||
|
||
Should you encounter an error that has not already been reported, please
|
||
do the following when reporting it: \* Make sure you are using the
|
||
latest version of extract_msg. \* State your Python version. \* Include
|
||
the code, if any, that you used. \* Include a copy of the traceback.
|
||
|
||
Installation
|
||
------------
|
||
|
||
You can install using pip:
|
||
|
||
- Pypi
|
||
|
||
.. code:: bash
|
||
|
||
pip install extract-msg
|
||
|
||
- Github
|
||
|
||
.. code:: sh
|
||
|
||
pip install git+https://github.com/mattgwwalker/msg-extractor
|
||
|
||
or you can include this in your list of python dependencies with:
|
||
|
||
.. code:: python
|
||
|
||
# setup.py
|
||
|
||
setup(
|
||
...
|
||
dependency_links=['https://github.com/mattgwwalker/msg-extractor/zipball/master'],
|
||
)
|
||
|
||
Todo
|
||
----
|
||
|
||
Here is a list of things that are currently on our todo list:
|
||
|
||
* Tests (ie. unittest)
|
||
* Finish writing a usage guide
|
||
* Improve the intelligence of the saving functions
|
||
* Provide a way to save attachments and messages into a custom location under a custom name
|
||
* Implement better property handling that will convert each type into a python equivalent if possible
|
||
* Implement handling of named properties
|
||
* Improve README
|
||
* Create a wiki for advanced usage information
|
||
|
||
Credits
|
||
-------
|
||
|
||
`Matthew Walker`_ - Original developer and owner
|
||
|
||
`Ken Peterson (The Elemental of Creation)`_ - Principle programmer, manager, and msg file "expert"
|
||
|
||
`JP Bourget`_ - Senior programmer, readability and organization expert, secondary manager
|
||
|
||
`Philippe Lagadec`_ - Python OleFile module developer
|
||
|
||
Joel Kaufman - First implementations of the json and filename flags
|
||
|
||
`Dean Malmgren`_ - First implementation of the setup.py script
|
||
|
||
.. |License: GPL v3| image:: https://img.shields.io/badge/License-GPLv3-blue.svg
|
||
:target: LICENSE.txt
|
||
.. |PyPI3| image:: https://img.shields.io/badge/pypi-0.23.0-blue.svg
|
||
:target: https://pypi.org/project/extract-msg/0.23.0/
|
||
.. |PyPI1| image:: https://img.shields.io/badge/python-2.7+-brightgreen.svg
|
||
:target: https://www.python.org/downloads/release/python-2715/
|
||
.. |PyPI2| image:: https://img.shields.io/badge/python-3.6+-brightgreen.svg
|
||
:target: https://www.python.org/downloads/release/python-367/
|
||
.. _Matthew Walker: https://github.com/mattgwwalker
|
||
.. _Ken Peterson (The Elemental of Creation): https://github.com/TheElementalOfCreation
|
||
.. _JP Bourget: https://github.com/punkrokk
|
||
.. _Philippe Lagadec: https://github.com/decalage2
|
||
.. _Dean Malmgren: https://github.com/deanmalmgren
|
||
|
||
|