Tuesday, 23 May 2017

Set up a Python environment using Anaconda and Jupyter







Anaconda is the leading open data science platform powered by Python. The open source version of Anaconda is a high-performance distribution of Python and R and includes over 100 of the most popular Python, R and Scala packages for data science.
Additionally, you'll have access to over 720 packages that can easily be installed with conda, Anaconda's renowned package, dependency and environment manager. See the packages included with Anaconda and the Anaconda changelog.

1. What is the Jupyter Notebook?

This page briefly introduces the main components of the Jupyter Notebook environment. For a more complete overview, see the References.

1.1. Notebook document

Notebook documents (or “notebooks”, all lower case) are documents produced by the Jupyter Notebook App, which contain both computer code (e.g. python) and rich text elements (paragraph, equations, figures, links, etc...). Notebook documents are both human-readable documents containing the analysis description and the results (figures, tables, etc..) as well as executable documents which can be run to perform data analysis.

1.2. Jupyter Notebook App

The Jupyter Notebook App is a server-client application that allows editing and running notebook documents via a web browser. The Jupyter Notebook App can be executed on a local desktop requiring no internet access (as described in this document) or can be installed on a remote server and accessed through the internet.
In addition to displaying/editing/running notebook documents, the Jupyter Notebook App has a “Dashboard” (the Notebook Dashboard), a “control panel” that shows local files and lets you open notebook documents or shut down their kernels.

1.3. Kernel

A notebook kernel is a “computational engine” that executes the code contained in a notebook document. The IPython kernel, referenced in this guide, executes Python code. Kernels for many other languages exist (official kernels).
When you open a notebook document, the associated kernel is automatically launched. When the notebook is executed (either cell by cell or with the menu Cell -> Run All), the kernel performs the computation and produces the results. Depending on the type of computation, the kernel may consume significant CPU and RAM. Note that the RAM is not released until the kernel is shut down.

1.4. Notebook Dashboard

The Notebook Dashboard is the component shown first when you launch the Jupyter Notebook App. It is mainly used to open notebook documents and to manage running kernels (visualize and shut down).
The Notebook Dashboard has other features similar to a file manager, namely navigating folders and renaming/deleting files.

2. Installation

2.1. Step 0: The browser

Step “zero” consists of installing a modern, standards-compliant browser. Either Mozilla Firefox or Google Chrome will work well. Avoid Internet Explorer.

2.2. Step 1: Installation

The easiest way to install the Jupyter Notebook App is to install a scientific Python distribution which includes it. In this guide, we will use the Anaconda distribution created by Continuum. Note that Anaconda currently (mid 2015) still uses the old name IPython Notebook instead of Jupyter Notebook App, but the software is the same.
  • Download Continuum Anaconda (free version, approx. 400 MB), Python 3, 64-bit.
  • Install it using the default settings for a single user.

3. Running the Jupyter Notebook

3.1. Launching Jupyter Notebook App

The Jupyter Notebook App can be launched by clicking on the Jupyter Notebook icon installed by Anaconda in the start menu (Windows) or by typing in a terminal (cmd on Windows):
jupyter notebook
This will launch a new browser window (or a new tab) showing the Notebook Dashboard, a sort of control panel that, among other things, lets you select which notebook to open.
Once started, the Jupyter Notebook App can access only files within its start-up folder (including any sub-folder). If you store your notebook documents in a subfolder of your user folder, no configuration is necessary. Otherwise, you need to choose a folder that will contain all the notebooks and set it as the Jupyter Notebook App start-up folder.
See below for platform-specific instructions on how to start Jupyter Notebook App in a specific folder.

3.1.1. Change Jupyter Notebook startup folder (Windows)

  • Copy the Jupyter Notebook launcher from the menu to the desktop.
  • Right click on the new launcher and change the “Start in” field by pasting the full path of the folder which will contain all the notebooks.
  • Double-click on the Jupyter Notebook desktop launcher (the icon shows [IPy]) to start the Jupyter Notebook App, which will open in a new browser window (or tab). Note that a secondary terminal window (used only for error logging and for shutting down) will also be opened. If only the terminal starts, try opening this address with your browser: http://localhost:8888/.

3.1.2. Change Jupyter Notebook startup folder (OS X)

To launch Jupyter Notebook App:
  • Open Spotlight, type terminal, and press Enter to open a terminal window.
  • Enter the startup folder by typing cd /some_folder_name.
  • Type jupyter notebook to launch the Jupyter Notebook App (it will appear in a new browser window or tab).
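
On either platform, an alternative is to set the start-up folder once in Jupyter's configuration file instead of adjusting the launcher each time. A minimal sketch for the classic Notebook app (the folder path below is just an example):

    # Generate the config file once (it is created under ~/.jupyter/):
    #     jupyter notebook --generate-config
    # Then edit jupyter_notebook_config.py and set the start-up folder:
    c.NotebookApp.notebook_dir = 'D:/my_notebooks'   # example path, use your own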

3.2. Shut down the Jupyter Notebook App

In a nutshell, closing the browser (or the tab) will not close the Jupyter Notebook App. To completely shut it down you need to close the associated terminal.
In more detail, the Jupyter Notebook App is a server that appears in your browser at a default address (http://localhost:8888). Closing the browser will not shut down the server. You can reopen the previous address and the Jupyter Notebook App will be redisplayed.
You can run many copies of the Jupyter Notebook App and they will show up at a similar address (only the number after ”:”, which is the port, will increment for each new copy). Since with a single Jupyter Notebook App you can already open many notebooks, we do not recommend running multiple copies of Jupyter Notebook App.

3.3. Close a notebook: kernel shut down

When a notebook is opened, its “computational engine” (called the kernel) is automatically started. Closing the notebook browser tab will not shut down the kernel; instead, the kernel will keep running until it is explicitly shut down.
To shut down a kernel, go to the associated notebook and click on menu File -> Close and Halt. Alternatively, the Notebook Dashboard has a tab named Running that shows all the running notebooks (i.e. kernels) and allows shutting them down (by clicking on a Shutdown button).

3.4. Executing a notebook

Download the notebook you want to execute and put it in your notebook folder (or a sub-folder of it).
Then follow these steps:
  • Launch the Jupyter Notebook App (see previous section).
  • In the Notebook Dashboard navigate to find the notebook: clicking on its name will open it in a new browser tab.
  • Click on the menu Help -> User Interface Tour for an overview of the Jupyter Notebook App user interface.
  • You can run the notebook document step by step (one cell at a time) by pressing Shift + Enter (see the example cell after this list).
  • You can run the whole notebook in a single step by clicking on the menu Cell -> Run All.
  • To restart the kernel (i.e. the computational engine), click on the menu Kernel -> Restart. This can be useful to start over a computation from scratch (e.g. variables are deleted, open files are closed, etc...).
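As a quick check that the kernel is working, here is a trivial example cell (not part of the original guide) that you can type into an empty notebook and run with Shift + Enter:

    # A first cell to try: select it and press Shift + Enter.
    import sys
    print("Python:", sys.version)

    numbers = [1, 2, 3, 4]
    print("mean =", sum(numbers) / len(numbers))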
More information on editing notebooks is available in the official Jupyter Notebook documentation.


Moving machine learning from practice to production

With growing interest in neural networks and deep learning, individuals and companies are claiming ever-increasing adoption rates of artificial intelligence into their daily workflows and product offerings.
Coupled with the breakneck speed of AI research, the new wave of popularity shows a lot of promise for solving some of the harder problems out there.
That said, I feel that this field suffers from a gulf between appreciating these developments and subsequently deploying them to solve "real-world" tasks.
A number of frameworks, tutorials and guides have popped up to democratize machine learning, but the steps that they prescribe often don't align with the fuzzier problems that need to be solved.
This post is a collection of questions (with some (maybe even incorrect) answers) that are worth thinking about when applying machine learning in production.

Garbage in, garbage out

Do I have a reliable source of data? Where do I obtain my dataset?
While starting out, most tutorials usually include well-defined datasets. Whether it be MNIST, the Wikipedia corpus or any of the great options from the UCI Machine Learning Repository, these datasets are often not representative of the problem that you wish to solve.
For your specific use case, an appropriate dataset might not even exist and building a dataset could take much longer than you expect.
For example, at Semantics3, we tackle a number of ecommerce-specific problems ranging from product categorization to product matching to search relevance. For each of these problems, we had to look within and spend considerable effort to generate high-fidelity product datasets.
In many cases, even if you possess the required data, significant (and expensive) manual labor might be required to categorize, annotate and label your data for training.
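
To make this concrete, a hand-built labeled dataset often starts as nothing fancier than a table of raw examples plus manually assigned labels. A minimal sketch, assuming pandas (the titles, categories and file name are invented for illustration):

    import pandas as pd

    # Purely illustrative rows: raw product titles plus manually assigned labels.
    labeled = pd.DataFrame({
        "title": ["Apple iPhone 7 32GB", "LEGO Star Wars X-Wing", "Dyson V8 vacuum"],
        "category": ["Electronics > Phones", "Toys > Building Sets", "Home > Appliances"],
    })
    labeled.to_csv("training_data.csv", index=False)   # hypothetical file name
    print(labeled["category"].value_counts())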

Transforming data to input

What pre-processing steps are required? How do I normalize my data before using with my algorithms?
This is another step, often independent of the actual models, that is glossed over in most tutorials. Such omissions appear even more glaring when exploring deep neural networks, where transforming the data into usable "input" is crucial.
While some standard techniques exist for images, such as cropping, scaling, zero-centering and whitening, the final decision on the level of normalization required for each task is still up to you.
The field gets even messier when working with text. Is capitalization important? Should I use a tokenizer? What about word embeddings? How big should my vocabulary and dimensionality be? Should I use pre-trained vectors or start from scratch or layer them?
There is no right answer applicable across all situations, but keeping abreast of available options is often half the battle. A recent post from the creator of spaCy details an interesting strategy to standardize deep learning for text.
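
To illustrate the kind of decisions involved, here is a small sketch assuming scikit-learn: numeric features are zero-centered and scaled, and a toy text corpus is lowercased and tokenized into a bag-of-words matrix (all data is invented):

    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.preprocessing import StandardScaler

    # Numeric features: zero-center and scale to unit variance.
    X = np.array([[1.0, 200.0], [2.0, 180.0], [3.0, 220.0]])
    X_scaled = StandardScaler().fit_transform(X)

    # Text: the lowercase flag makes the capitalization decision explicit.
    docs = ["Blue cotton shirt", "blue denim jeans", "Red cotton shirt"]
    vectorizer = CountVectorizer(lowercase=True)
    bag_of_words = vectorizer.fit_transform(docs)

    print(X_scaled.mean(axis=0))           # roughly zero per column after centering
    print(sorted(vectorizer.vocabulary_))  # the learned vocabulary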

Now, let's begin?

Which language/framework do I use? Python, R, Java, C++? Caffe, Torch, Theano, TensorFlow, DL4J?
This might be the question with the most opinionated answers. I am including this section here only for completeness and would gladly point you to the various other resources available for making this decision.
While each person might have different criteria for evaluation, mine has simply been ease of customization, prototyping and testing. In that aspect, I prefer to start with scikit-learn where possible and use Keras for my deep learning projects.
Further questions follow, such as: which technique should I use? Should I use deep or shallow models? What about CNNs/RNNs/LSTMs? Again, there are a number of resources to help make these decisions, and this is perhaps the most discussed aspect when people talk about "using" machine learning.
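
As an example of why I find scikit-learn convenient for prototyping, a complete train-and-evaluate loop for a shallow model fits in a handful of lines. A sketch on a built-in toy dataset (not one of our production models):

    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Toy dataset: 8x8 digit images flattened into 64 features.
    X, y = load_digits(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)

    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train, y_train)
    print("held-out accuracy:", clf.score(X_test, y_test))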

Training models

How do I train my models? Should I buy GPUs, custom hardware, or EC2 (spot?) instances? Can I parallelize them for speed?
With ever-rising model complexity, and increasing demands on processing power, this is an unavoidable question when moving to production.
A billion-parameter network might promise great performance with its terabyte-sized dataset, but most people cannot afford to wait for weeks while the training is still in-progress.
Even with simpler models, the infrastructure and tooling required for the build-up, training, collation and tear-down of tasks across instances can be quite daunting.
Spending some time planning your infrastructure, standardizing setup and defining workflows early on can save valuable time with each additional model that you build.
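
One small habit that helps standardize training runs is wiring checkpointing and early stopping into every job from the start, so long runs are both restartable and bounded. A sketch assuming Keras with a TensorFlow backend and synthetic data (file name and hyperparameters are arbitrary):

    import numpy as np
    from tensorflow import keras

    # Synthetic data standing in for a real training set.
    X = np.random.rand(1000, 20).astype("float32")
    y = (X.sum(axis=1) > 10).astype("float32")

    model = keras.Sequential([
        keras.layers.Dense(32, activation="relu", input_shape=(20,)),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

    callbacks = [
        # Keep the best weights seen so far, so an interrupted run is not lost.
        keras.callbacks.ModelCheckpoint("best_model.keras", save_best_only=True),
        # Stop once validation loss stops improving, so runs stay bounded.
        keras.callbacks.EarlyStopping(patience=3, restore_best_weights=True),
    ]
    model.fit(X, y, validation_split=0.2, epochs=50, callbacks=callbacks, verbose=0)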

No system is an island

Do I need to make batched or real-time predictions? Embedded models or interfaces? RPC or REST?
Your 99%-validation-accuracy model is not of much use unless it interfaces with the rest of your production system. The decision here is at least partially driven by your use-case.
A model performing a simple task might perform satisfactorily with its weights packaged directly into your application, while more complicated models might require communication with centralized heavy-lifting servers.
In our case, most of our production systems perform tasks offline in batches, while a minority serve real-time predictions via JSON-RPC over HTTP.
Knowing the answers to these questions might also restrict the types of architectures that you should consider when building your models. Building a complex model only to later learn that it cannot be deployed within your mobile app is a disaster that can easily be avoided.
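
For the real-time case, the serving layer itself can stay very thin. A minimal sketch of a REST-style prediction endpoint (Flask, joblib and the model path are my assumptions for illustration, not what we actually run):

    import joblib
    from flask import Flask, jsonify, request

    app = Flask(__name__)
    model = joblib.load("model.pkl")  # hypothetical path to a pickled scikit-learn model

    @app.route("/predict", methods=["POST"])
    def predict():
        # Expects a JSON body such as {"features": [[1.2, 3.4]]}.
        features = request.get_json()["features"]
        return jsonify({"prediction": model.predict(features).tolist()})

    if __name__ == "__main__":
        app.run(port=5000)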

Monitoring performance

How do I keep track of my predictions? Do I log my results to a database? What about online learning?
After building, training and deploying your models to production, the task is still not complete unless you have monitoring systems in place. A crucial component to ensuring the success of your models is being able to measure and quantify their performance. A number of questions are worth answering in this area.
How does my model affect the overall system performance? Which numbers do I measure? Does the model correctly handle all possible inputs and scenarios?
Having used Postgres in the past, I favor using it for monitoring my models. Periodically saving production statistics (data samples, predicted results, outlier specifics) has proven invaluable in performing analytics (and error postmortems) over deployments.
Another important aspect to consider is the online-learning requirement of your model. Should your model learn new features on the fly? When hoverboards become a reality, should the product categorizer put them under Vehicles or Toys, or leave them Uncategorized? Again, these are important questions worth debating when building your system.
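
As a concrete example of what such monitoring can look like, here is a rough sketch of logging predictions to Postgres with psycopg2 (the connection string, table layout and sample row are all invented for illustration):

    import json
    import psycopg2

    conn = psycopg2.connect("dbname=monitoring user=ml")  # hypothetical connection string
    with conn, conn.cursor() as cur:
        # One row per production prediction: inputs, outputs and a timestamp.
        cur.execute("""
            CREATE TABLE IF NOT EXISTS predictions (
                id         serial PRIMARY KEY,
                created_at timestamptz DEFAULT now(),
                model_name text,
                input      jsonb,
                output     jsonb
            )""")
        cur.execute(
            "INSERT INTO predictions (model_name, input, output) VALUES (%s, %s, %s)",
            ("product-categorizer",
             json.dumps({"title": "hoverboard"}),
             json.dumps({"label": "Uncategorized"})))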

Wrapping it up

There is more to it than just the secret sauce.
This post poses more questions than it answers, but that was sort of the point really. With many advances in new techniques and cells and layers and network architectures, it is easier than ever to miss the forest for the trees.
Greater discussion about end-to-end deployments is required among practitioners to take this field forward and truly democratize machine learning for the masses.



Reference

https://www.fullstackpython.com/deployment.html
https://engineering.semantics3.com/2016/11/13/machine-learning-practice-to-production/
 

Friday, 3 March 2017

Disable the system reports in CRM

There are two ways. The first is through the solution: hide the report by removing its display options.
Clear everything in the "Display in" area.
Then don't forget to save and publish (for reports I think publishing is optional, but you can do it as well).
The other way is to change the report's visibility by changing its ownership type from Organization to Individual.
You cannot do this through the solution; instead, go to Sales -> Reports.
Then click Edit and change the "Viewable By" setting.
Once you change it, only the individual owner can see this report, unless the owner shares it out.

https://community.dynamics.com/crm/f/117/t/149132
 

Wednesday, 1 March 2017

MSCRM - You do not have sufficient privileges to view this chart

How to fix the error "You do not have sufficient privileges to view this chart”

Recently, I had an issue where a customer logged in to Microsoft Dynamics CRM and could not see anything AT ALL. All links to accounts, contacts, calendars, and every other record type were missing. Third-party add-ons, such as Powertrak, still appeared, but no native CRM entities were available.
The Dashboard displayed the error, “User does not have sufficient privileges to view this chart. Contact your systems administrator.”

This error was caused by someone erroneously changing the user's “Access Mode” and “License Type” to “Administrative.” This gave the user the ability to make only administrative changes to CRM itself, with no access to records.
Normal CRM usage requires an Access Mode of “Read-Write” and a License Type of “Full” or “Limited.”

To resolve this issue:

  • Log in as another user with Administrator rights.
  • Click Settings -> Administration -> Users -> Open the user in question
  • Look at the settings under Client Access License (CAL) information.
  • Set them correctly for the user. Most likely, they’ll need to be “Read-Write” and “Full,” respectively.
  • Sign out, sign back in, and things should be fine!

I have seen this issue pop up a few times, but no one ever seems to have a reason why it happens. It could be anything from someone messing with records to an erroneous data import. Once fixed, make sure to ask your co-workers whether anyone was doing anything that could have caused this.

Reference

https://community.dynamics.com/crm/f/117/t/177241
http://www.axonom.com/troubleshooting-you-do-not-have-sufficient-privileges-to-view-this-chart


 

Azure AzCopy Command  in Action -  Install - Module - Name Az - Scope CurrentUser - Repository PSGallery - Force # This simple PowerShell ...